
You are using a model of type mini_gemini_mixtral to instantiate a model of type mini_gemini. This is not supported for all configurations of models and can yield errors. #63

Open
lightingvector opened this issue Apr 16, 2024 · 1 comment

lightingvector (Contributor) commented Apr 16, 2024

I managed to finetune the mini-gemini mixtral model, but after finetuning I am unable to run inference with it. I tried to launch a model worker as described in the repo: python -m minigemini.serve.model_worker --host 0.0.0.0 --controller http://localhost:10000 --port 40001 --worker http://localhost:40001 --model-path Mini-Gemini-mixtral/

After a long wait I then get:

You are using a model of type mini_gemini_mixtral to instantiate a model of type mini_gemini. This is not supported for all configurations of models and can yield errors.
Loading checkpoint shards:   0%|                                                                    | 0/36 [00:00<?, ?it/s]
Loading checkpoint shards:   0%|                                                                    | 0/36 [00:00<?, ?it/s]
2024-04-16 23:00:00 | ERROR | stderr | 
2024-04-16 23:00:00 | ERROR | stderr | Traceback (most recent call last):
2024-04-16 23:00:00 | ERROR | stderr |   File "<frozen runpy>", line 198, in _run_module_as_main
2024-04-16 23:00:00 | ERROR | stderr |   File "<frozen runpy>", line 88, in _run_code
2024-04-16 23:00:00 | ERROR | stderr |   File "/home/paperspace/MiniGemini/minigemini/serve/model_worker.py", line 389, in <module>
2024-04-16 23:00:00 | ERROR | stderr |     worker = ModelWorker(args.controller_address,
2024-04-16 23:00:00 | ERROR | stderr |              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-04-16 23:00:00 | ERROR | stderr |   File "/home/paperspace/MiniGemini/minigemini/serve/model_worker.py", line 76, in __init__
2024-04-16 23:00:00 | ERROR | stderr |     self.tokenizer, self.model, self.image_processor, self.context_len = load_pretrained_model(
2024-04-16 23:00:00 | ERROR | stderr |                                                                          ^^^^^^^^^^^^^^^^^^^^^^
2024-04-16 23:00:00 | ERROR | stderr |   File "/home/paperspace/MiniGemini/minigemini/model/builder.py", line 76, in load_pretrained_model
2024-04-16 23:00:00 | ERROR | stderr |     model = MiniGeminiLlamaForCausalLM.from_pretrained(model_path, low_cpu_mem_usage=True, **kwargs)
2024-04-16 23:00:00 | ERROR | stderr |             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-04-16 23:00:00 | ERROR | stderr |   File "/home/paperspace/MiniGemini/venv/lib/python3.11/site-packages/transformers/modeling_utils.py", line 3706, in from_pretrained
2024-04-16 23:00:00 | ERROR | stderr |     ) = cls._load_pretrained_model(
2024-04-16 23:00:00 | ERROR | stderr |         ^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-04-16 23:00:00 | ERROR | stderr |   File "/home/paperspace/MiniGemini/venv/lib/python3.11/site-packages/transformers/modeling_utils.py", line 4091, in _load_pretrained_model
2024-04-16 23:00:00 | ERROR | stderr |     state_dict = load_state_dict(shard_file)
2024-04-16 23:00:00 | ERROR | stderr |                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-04-16 23:00:00 | ERROR | stderr |   File "/home/paperspace/MiniGemini/venv/lib/python3.11/site-packages/transformers/modeling_utils.py", line 505, in load_state_dict
2024-04-16 23:00:00 | ERROR | stderr |     if metadata.get("format") not in ["pt", "tf", "flax"]:
2024-04-16 23:00:00 | ERROR | stderr |        ^^^^^^^^^^^^
2024-04-16 23:00:00 | ERROR | stderr | AttributeError: 'NoneType' object has no attribute 'get'

Could this be due to the ZeRO-to-fp32 conversion after training?
I did run the zero_to_fp32 conversion, but saved the result as sharded safetensors instead of a single pytorch_model.bin.
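
In case it is related, here is a minimal (hypothetical) check and workaround I am considering, assuming the error comes from shards that were saved without the "format" entry in their safetensors metadata, and that the shard files match model-*.safetensors:

```python
# Hypothetical sketch: check the safetensors shard metadata and, if the
# "format" key is missing, re-save each shard with metadata={"format": "pt"}
# so transformers' load_state_dict() no longer calls .get() on None.
# Assumes the shards match model-*.safetensors and each fits in CPU RAM.
import glob
from safetensors import safe_open
from safetensors.torch import load_file, save_file

for shard in sorted(glob.glob("Mini-Gemini-mixtral/model-*.safetensors")):
    with safe_open(shard, framework="pt") as f:
        meta = f.metadata()  # None or a dict without "format" would trigger the error above
    if not meta or "format" not in meta:
        tensors = load_file(shard)
        save_file(tensors, shard, metadata={"format": "pt"})
        print(f"rewrote {shard} with format metadata")
```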

yanwei-li (Member) commented
Hi, please rename your finetuned model directory so that it contains the string "8x7b", which is what L68 of model/builder.py checks for when deciding to load the Mixtral model. Alternatively, you can modify that loading rule in L68 of model/builder.py directly.
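
For illustration, a hypothetical rename along these lines should make the loader take the Mixtral branch instead of the Llama one (the target directory name here is just an example); this also explains why your traceback shows MiniGeminiLlamaForCausalLM being used:

```python
# Hypothetical example: rename the finetuned checkpoint directory so that
# the loader's name check ("8x7b" in the path) routes it to the Mixtral
# branch instead of the Llama one.
import shutil
shutil.move("Mini-Gemini-mixtral", "Mini-Gemini-8x7b-finetune")
```

Then relaunch the model worker with --model-path pointing at the renamed directory.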
