Add instructions on running llava-v1.6-mistral-7b #1115

Open · wants to merge 1 commit into main
Conversation

@aliencaocao commented Feb 10, 2024

After many hours of debugging, I finally got llava-v1.6-mistral-7b to work fully on SGLang inference backend.

This PR adds the relevant instructions to README.md, which references a PR I made on Hugging Face containing all the patches needed to make loading work.

Closes #1114
Closes #1112
Closes #1179
Also closes sgl-project/sglang#128 (from the SGLang repo)

Summary of patches:

  1. Create added_tokens.json containing:
{
  "<image>": 32000,
  "<pad>": 32001
}

this was from https://huggingface.co/SurfaceData/llava-v1.6-vicuna-7b-processor/blob/main/added_tokens.json which is linked by sgl-project/sglang#127 (comment)
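This step can be sketched in Python (the checkpoint directory name is an assumption; adjust it to wherever the weights were downloaded):

```python
import json
import os

# Assumed local path to the downloaded llava-v1.6-mistral-7b checkpoint.
model_dir = "llava-v1.6-mistral-7b"
os.makedirs(model_dir, exist_ok=True)

# Token ids taken from the SurfaceData processor repo linked above.
added_tokens = {"<image>": 32000, "<pad>": 32001}

# Write added_tokens.json next to the other tokenizer files.
with open(os.path.join(model_dir, "added_tokens.json"), "w") as f:
    json.dump(added_tokens, f, indent=2)
```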

  2. In config.json, change "LlavaMistralForCausalLM" to "LlavaLlamaForCausalLM", and "model_type": "llava_mistral" to "model_type": "llava".
    This was from [Bug] liuhaotian/llava-v1.6-mistral-7b doesn't load sgl-project/sglang#128 (comment)
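A minimal Python sketch of the same edit (the stand-in dict mirrors only the two affected keys; load the real config.json in practice):

```python
# Stand-in for the relevant keys of config.json; in practice:
#   config = json.load(open("config.json"))
config = {
    "architectures": ["LlavaMistralForCausalLM"],
    "model_type": "llava_mistral",
}

# Rename the architecture and model type so the llava loader picks the model up.
config["architectures"] = ["LlavaLlamaForCausalLM"]
config["model_type"] = "llava"
```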

  3. In generation_config.json, add a line before "transformers_version":
    "pad_token_id": 32001,

  4. Add preprocessor_config.json from https://huggingface.co/SurfaceData/llava-v1.6-vicuna-7b-processor/blob/main/preprocessor_config.json

{
	"crop_size": {
	  "height": 336,
	  "width": 336
	},
	"do_center_crop": true,
	"do_convert_rgb": true,
	"do_normalize": true,
	"do_rescale": true,
	"do_resize": true,
	"image_mean": [
	  0.48145466,
	  0.4578275,
	  0.40821073
	],
	"image_processor_type": "CLIPImageProcessor",
	"image_std": [
	  0.26862954,
	  0.26130258,
	  0.27577711
	],
	"processor_class": "LlavaProcessor",
	"resample": 3,
	"rescale_factor": 0.00392156862745098,
	"size": {
	  "shortest_edge": 336
	}
  }
  5. In special_tokens_map.json, add:
  "pad_token": {
    "content": "<pad>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },

this was from https://huggingface.co/SurfaceData/llava-v1.6-vicuna-7b-processor/blob/main/special_tokens_map.json

  6. Replace tokenizer_config.json with https://huggingface.co/SurfaceData/llava-v1.6-vicuna-7b-processor/blob/main/tokenizer_config.json
    Diffs from the original:
    "32000": {
      "content": "<image>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "32001": {
      "content": "<pad>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    }
  },

and

  "legacy": false,
  "model_max_length": 4096,
  "pad_token": "<pad>",
  "padding_side": "right",
  "processor_class": "LlavaProcessor",

But you need to keep the "chat_template" entry from the original file (the vicuna one does not have it).
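The carry-over can be sketched as follows (both dicts are illustrative stand-ins; load the real mistral and vicuna-processor files in practice):

```python
# Stand-ins: "original" is the mistral tokenizer_config.json, "replacement"
# is the vicuna-processor one, which lacks a chat_template.
original = {
    "chat_template": "{% for message in messages %}{{ message['content'] }}{% endfor %}"
}
replacement = {
    "legacy": False,
    "model_max_length": 4096,
    "pad_token": "<pad>",
    "padding_side": "right",
    "processor_class": "LlavaProcessor",
}

# Keep the original chat_template, since the vicuna file does not have one.
replacement["chat_template"] = original["chat_template"]
```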

  7. For tokenizer.json, use the original one, but append the following to "added_tokens":
    {
      "id": 32000,
      "content": "<image>",
      "single_word": false,
      "lstrip": false,
      "rstrip": false,
      "normalized": false,
      "special": true
    },
    {
      "id": 32001,
      "content": "<pad>",
      "single_word": false,
      "lstrip": false,
      "rstrip": false,
      "normalized": false,
      "special": true
    }
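The append step can be sketched as (the stand-in list holds a single placeholder entry; load the real tokenizer.json in practice):

```python
# Stand-in for tokenizer.json; in practice:
#   tok = json.load(open("tokenizer.json"))
tok = {"added_tokens": [{"id": 0, "content": "<unk>", "special": True}]}

# Flags shared by both new entries.
common = {"single_word": False, "lstrip": False, "rstrip": False,
          "normalized": False, "special": True}

tok["added_tokens"].append({"id": 32000, "content": "<image>", **common})
tok["added_tokens"].append({"id": 32001, "content": "<pad>", **common})
```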

@RonanKMcGovern commented Feb 14, 2024

FWIW I've pushed what I think are these patches to huggingface here

@RonanKMcGovern

Perhaps I'm doing something wrong, but these patches result in <pad> being in the response:

Prompt: [INST] <image>
What do you see in this picture? [/INST]
<s> The image shows a wooden chess set on a wooden table. There are three chess pieces visible: a rook, a knight, and a pawn. The rook and knight are standing upright, while the pawn is lying on its side. The pieces appear to be made of a dark wood, and the table has a light wood finish. The shadow of the chess pieces is<pad><pad><pad> the table,<pad><pad><pad><pad><pad><pad>, indicating that the light source is coming from the direction the shadow is cast. </s>

@aliencaocao (Author)

I didn't observe this using the chair example. Try deleting the pad-related additions? I don't actually have concrete evidence that the pad token is even necessary.

@aliencaocao (Author)

Yeah, the pad token seems to be extra, since they use unk as pad, so I suggest deleting the pad-related entries and setting the pad token id in the various files to 0 (unk).
I've been running this with no issues so far.

@RylanSchaeffer

@RonanKMcGovern thanks for posting the patched version on huggingface! Quick question: did you update to include @aliencaocao 's recent pad solution?

@RonanKMcGovern commented Mar 3, 2024 via email

@RylanSchaeffer

@RonanKMcGovern can you link the video?

@RonanKMcGovern commented Mar 3, 2024 via email

@ppx-hub commented Mar 6, 2024

> [quotes the full PR description above]
Thanks, the same fix also resolves "Cannot launch SGLang demo on llava-v1.5-13b".
