
[BUG] ValueError: [quantize] The last dimension of the matrix needs to be divisible by the quantization group size 64. #1033

Open
Blaizzy opened this issue Apr 25, 2024 · 7 comments

Comments

@Blaizzy

Blaizzy commented Apr 25, 2024

Describe the bug
When I try to quantize a VLM that uses SigLIP, it throws a ValueError because the model has an intermediate size of 4304, which is not divisible by 64 or 128.

To Reproduce

Include code snippet

pip install -U mlx-vlm

python -m mlx_vlm.convert \
    --hf-path qnguyen3/nanoLLaVA \
    -q

Expected behavior
Successfully quantize the model.

Desktop (please complete the following information):

  • OS Version: macOS 14.4.1
  • Version: 0.11.1

Additional context
Add any other context about the problem here.

Traceback

Traceback (most recent call last):
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx_code/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx_code/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/Users/prince_canuma/Documents/Projects/LLMs/mlx-vlm/mlx_vlm/convert.py", line 62, in <module>
    main()
  File "/Users/prince_canuma/Documents/Projects/LLMs/mlx-vlm/mlx_vlm/convert.py", line 58, in main
    convert(**vars(args))
  File "/Users/prince_canuma/Documents/Projects/LLMs/mlx-vlm/mlx_vlm/utils.py", line 540, in convert
    weights, config = quantize_model(model, config, q_group_size, q_bits)
  File "/Users/prince_canuma/Documents/Projects/LLMs/mlx-vlm/mlx_vlm/utils.py", line 452, in quantize_model
    nn.quantize(model, q_group_size, q_bits, class_predicate=class_predicate)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx_code/lib/python3.10/site-packages/mlx/nn/layers/quantized.py", line 51, in quantize
    leaves = tree_map_with_path(_maybe_quantize, leaves, is_leaf=Module.is_module)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx_code/lib/python3.10/site-packages/mlx/utils.py", line 95, in tree_map_with_path
    return {
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx_code/lib/python3.10/site-packages/mlx/utils.py", line 96, in <dictcomp>
    k: tree_map_with_path(
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx_code/lib/python3.10/site-packages/mlx/utils.py", line 95, in tree_map_with_path
    return {
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx_code/lib/python3.10/site-packages/mlx/utils.py", line 96, in <dictcomp>
    k: tree_map_with_path(
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx_code/lib/python3.10/site-packages/mlx/utils.py", line 95, in tree_map_with_path
    return {
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx_code/lib/python3.10/site-packages/mlx/utils.py", line 96, in <dictcomp>
    k: tree_map_with_path(
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx_code/lib/python3.10/site-packages/mlx/utils.py", line 95, in tree_map_with_path
    return {
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx_code/lib/python3.10/site-packages/mlx/utils.py", line 96, in <dictcomp>
    k: tree_map_with_path(
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx_code/lib/python3.10/site-packages/mlx/utils.py", line 95, in tree_map_with_path
    return {
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx_code/lib/python3.10/site-packages/mlx/utils.py", line 96, in <dictcomp>
    k: tree_map_with_path(
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx_code/lib/python3.10/site-packages/mlx/utils.py", line 87, in tree_map_with_path
    return TreeType(
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx_code/lib/python3.10/site-packages/mlx/utils.py", line 88, in <genexpr>
    tree_map_with_path(
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx_code/lib/python3.10/site-packages/mlx/utils.py", line 95, in tree_map_with_path
    return {
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx_code/lib/python3.10/site-packages/mlx/utils.py", line 96, in <dictcomp>
    k: tree_map_with_path(
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx_code/lib/python3.10/site-packages/mlx/utils.py", line 95, in tree_map_with_path
    return {
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx_code/lib/python3.10/site-packages/mlx/utils.py", line 96, in <dictcomp>
    k: tree_map_with_path(
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx_code/lib/python3.10/site-packages/mlx/utils.py", line 83, in tree_map_with_path
    return fn(path, tree, *rest)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx_code/lib/python3.10/site-packages/mlx/nn/layers/quantized.py", line 42, in _maybe_quantize
    return QuantizedLinear.from_linear(m, group_size, bits)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx_code/lib/python3.10/site-packages/mlx/nn/layers/quantized.py", line 226, in from_linear
    ql = cls(input_dims, output_dims, False, group_size, bits)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx_code/lib/python3.10/site-packages/mlx/nn/layers/quantized.py", line 185, in __init__
    self.weight, self.scales, self.biases = mx.quantize(weight, group_size, bits)
ValueError: [quantize] The last dimension of the matrix needs to be divisible by the quantization group size 64. However the provided  matrix has shape (1152,4304)
@awni
Member

awni commented Apr 25, 2024

It's not a bug. At the risk of being redundant, the last dimension of the matrix has to be divisible by the quantization group size. For size 4304 there is no supported group size that divides it (none of 32, 64, 128).

It's not on our roadmap to support irregular sizes... but we can leave this issue open to help prioritize if it's something we should consider in the future.
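For concreteness, the arithmetic behind this can be checked in plain Python (this snippet is illustrative only; it does not use MLX):

```python
# SigLIP's intermediate size is 4304: the last dimension of the
# (1152, 4304) weight matrix from the traceback above.
intermediate_size = 4304

# None of the supported group sizes divide it evenly.
for group_size in (32, 64, 128):
    remainder = intermediate_size % group_size
    print(f"group size {group_size}: remainder {remainder}")

# A group size of 16 would divide it: 4304 = 16 * 269.
assert intermediate_size % 16 == 0
```

Running it shows remainders of 16, 16, and 80 for group sizes 32, 64, and 128 respectively, which is why every supported group size fails.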

@s-smits

s-smits commented Apr 25, 2024

It can be divided by 16; would supporting that be complicated to implement?

@Blaizzy
Author

Blaizzy commented Apr 25, 2024

> It's not a bug. At the risk of being redundant, the last dimension of the matrix has to be divisible by the quantization group size. For size 4304 there is no supported group size that divides it (none of 32, 64, 128).
>
> It's not on our roadmap to support irregular sizes... but we can leave this issue open to help prioritize if it's something we should consider in the future.

Yes, it's not a bug. It's more of a feature request / clarification, because no SigLIP-based VLM can be quantized due to this, which includes Idefics 2, nanoLLaVA, and DeepSeek-VL.

@Blaizzy
Author

Blaizzy commented Apr 25, 2024

Is there a way in MLX to skip a particular target layer or block in the model?

I mean a specific layer or block, not all layers of the same type the way class_predicate does.

@awni
Member

awni commented Apr 25, 2024

You can use class_predicate for that. Just put the condition you want in the predicate. For example, if you want to skip weights of a certain shape:

class_predicate = lambda p, m: isinstance(m, nn.Linear) and m.weight.shape != (x, y)
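To show how such a predicate behaves, here is a minimal, self-contained sketch. It uses a dummy Linear stand-in (not mlx.nn) so the shape-filtering logic can be demonstrated in isolation; the shape (1152, 4304) is the problematic SigLIP matrix from the traceback, and x, y above are placeholders for whatever shape you want to skip:

```python
from types import SimpleNamespace

class Linear:
    """Dummy stand-in for mlx.nn.Linear, holding only a weight shape."""
    def __init__(self, shape):
        self.weight = SimpleNamespace(shape=shape)

# Quantize only Linear layers whose weight shape is NOT (1152, 4304).
class_predicate = lambda p, m: (
    isinstance(m, Linear) and m.weight.shape != (1152, 4304)
)

# A regular layer passes the predicate and would be quantized.
assert class_predicate("mlp.fc1", Linear((4096, 4096)))
# The irregular SigLIP layer fails the predicate and is skipped.
assert not class_predicate("vision.fc2", Linear((1152, 4304)))
```

With the real API, this predicate is passed as the class_predicate argument to nn.quantize, which calls it with each module's path and the module itself.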

@Blaizzy
Author

Blaizzy commented Apr 25, 2024

Thank you very much, I will give it a try ASAP!

@Blaizzy
Author

Blaizzy commented Apr 25, 2024

It works wonders! 💯

Also found a better way, skipping the entire block:

class_predicate = lambda p, m: isinstance(m, nn.Linear) and p.split('.')[0] != "vision_tower"
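The path-based variant can be sketched the same way; again with a dummy Linear stand-in rather than mlx.nn, and with module paths made up for illustration:

```python
class Linear:
    """Dummy stand-in for mlx.nn.Linear."""
    pass

# Quantize every Linear layer except those under the "vision_tower" block,
# identified by the first component of the module's dotted path.
class_predicate = lambda p, m: (
    isinstance(m, Linear) and p.split(".")[0] != "vision_tower"
)

# Language-model layers pass the predicate and get quantized.
assert class_predicate("language_model.layers.0.mlp", Linear())
# Anything under the vision tower is skipped wholesale.
assert not class_predicate("vision_tower.blocks.0.attn", Linear())
```

Skipping by path prefix is coarser than matching shapes, but it keeps the whole vision tower in full precision with one condition instead of enumerating every irregular layer.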
