
Issue with saving and loading low bit BLIP-2 model #10892

Open
wayfeng opened this issue Apr 26, 2024 · 1 comment
wayfeng commented Apr 26, 2024

The original BLIP2-OPT-6.7B model takes more than 30 GB of RAM to load and convert, so I want to save the compressed model and then load it directly on another PC with limited RAM. Saving succeeded, but loading failed.

from transformers import Blip2Processor, Blip2ForConditionalGeneration
from ipex_llm import optimize_model

model_id = "Salesforce/blip2-opt-6.7b"  # or "Salesforce/blip2-opt-2.7b"
processor = Blip2Processor.from_pretrained(model_id)
model = Blip2ForConditionalGeneration.from_pretrained(model_id)

device = 'xpu'
optimized_model = optimize_model(model, device=device)

model_path = "optimized-blip2"
optimized_model.save_low_bit(model_path)
processor.save_pretrained(model_path)
$ ls -la optimized-blip2
total 4.7G
drwxrwxr-x 2 wayne wayne 4.0K Apr 25 16:55 .
drwxrwxr-x 6 wayne wayne 4.0K Apr 26 08:40 ..
-rw-rw-r-- 1 wayne wayne   42 Apr 25 16:54 bigdl_config.json
-rw-rw-r-- 1 wayne wayne  942 Apr 25 16:53 config.json
-rw-rw-r-- 1 wayne wayne  136 Apr 25 16:53 generation_config.json
-rw-rw-r-- 1 wayne wayne 446K Apr 25 16:55 merges.txt
-rw-rw-r-- 1 wayne wayne 4.7G Apr 25 16:54 model.safetensors
-rw-rw-r-- 1 wayne wayne  432 Apr 25 16:55 preprocessor_config.json
-rw-rw-r-- 1 wayne wayne  548 Apr 25 16:55 special_tokens_map.json
-rw-rw-r-- 1 wayne wayne  708 Apr 25 16:55 tokenizer_config.json
-rw-rw-r-- 1 wayne wayne 2.1M Apr 25 16:55 tokenizer.json
-rw-rw-r-- 1 wayne wayne 780K Apr 25 16:55 vocab.json
from ipex_llm.optimize import load_low_bit
copied_model = load_low_bit(copied_model, model_path)

2024-04-26 08:39:58,752 - INFO - Converting the current model to sym_int4 format......
2024-04-26 08:39:59,115 - ERROR - 

****************************Usage Error************************
Error no file named pytorch_model.bin found in directory optimized-blip2.
2024-04-26 08:39:59,116 - ERROR - 

****************************Call Stack*************************
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Cell In[19], line 1
----> 1 copied_model = load_low_bit(copied_model, 'optimized-blip2')

File ~/.env/ipex-llm/lib/python3.10/site-packages/ipex_llm/optimize.py:178, in load_low_bit(model, model_path)
    175     qtype = ggml_tensor_qtype[low_bit]
    176     model = ggml_convert_low_bit(model, qtype=qtype, convert_shape_only=True)
--> 178 resolved_archive_file, is_sharded = extract_local_archive_file(model_path, subfolder="")
    179 if is_sharded:
    180     # For now only shards transformers models
    181     # can run in this branch.
    182     resolved_archive_file, _ = \
    183         get_local_shard_files(model_path,
    184                               resolved_archive_file,
    185                               subfolder="")

File ~/.env/ipex-llm/lib/python3.10/site-packages/ipex_llm/transformers/utils.py:83, in extract_local_archive_file(pretrained_model_name_or_path, subfolder, variant)
     81     return archive_file, is_sharded
     82 else:
---> 83     invalidInputError(False,
     84                       f"Error no file named {_add_variant(WEIGHTS_NAME, variant)}"
     85                       " found in directory"
     86                       f" {pretrained_model_name_or_path}.")

File ~/.env/ipex-llm/lib/python3.10/site-packages/ipex_llm/utils/common/log4Error.py:32, in invalidInputError(condition, errMsg, fixMsg)
     30 if not condition:
     31     outputUserMessage(errMsg, fixMsg)
---> 32     raise RuntimeError(errMsg)

RuntimeError: Error no file named pytorch_model.bin found in directory optimized-blip2.
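For what it's worth, the directory listing above shows model.safetensors but no pytorch_model.bin, which matches the error message. Below is a minimal diagnostic sketch; the safetensors-vs-bin mismatch is my reading of the traceback, and `weight_format` is a hypothetical helper, not an ipex-llm API:

```python
import os

def weight_format(model_dir: str) -> str:
    """Report which serialized weight file a saved model directory contains.

    load_low_bit (per the traceback) looks for pytorch_model.bin, while newer
    transformers versions save model.safetensors by default.
    """
    if os.path.exists(os.path.join(model_dir, "pytorch_model.bin")):
        return "bin"
    if os.path.exists(os.path.join(model_dir, "model.safetensors")):
        return "safetensors"
    return "none"
```

If this reports "safetensors", one possible workaround is to re-save the weights as pytorch_model.bin (e.g. load the state dict with safetensors.torch.load_file and write it back out with torch.save).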
pengyb2001 commented Apr 29, 2024

Hi there, I tried this on my Arc A770 machine; my environment is:

transformers==4.31.0
-----------------------------------------------------------------
Name: ipex-llm
Version: 2.1.0b20240421

I first downloaded the Salesforce/blip2-opt-6.7b to my machine

arda@arda-arc05:/mnt/disk1/models$ ls /mnt/disk1/models/blip2
config.json                       pytorch_model-00003-of-00004.bin  tokenizer_config.json
merges.txt                        pytorch_model-00004-of-00004.bin  tokenizer.json
preprocessor_config.json          pytorch_model.bin.index.json      vocab.json
pytorch_model-00001-of-00004.bin  README.md
pytorch_model-00002-of-00004.bin  special_tokens_map.json

and then used an absolute path to load and convert the model. I did not encounter the issue you mentioned.

from transformers import Blip2Processor, Blip2ForConditionalGeneration
from ipex_llm import optimize_model

model_id = "/mnt/disk1/models/blip2"  
processor = Blip2Processor.from_pretrained(model_id)
model = Blip2ForConditionalGeneration.from_pretrained(model_id)

device = 'xpu'
optimized_model = optimize_model(model, device=device)

model_path = "optimized-blip2"
optimized_model.save_low_bit(model_path)
processor.save_pretrained(model_path)
arda@arda-arc05:/mnt/disk1/models$ ls /mnt/disk1/models/optimized-blip2
bigdl_config.json  preprocessor_config.json  tokenizer_config.json
config.json        pytorch_model.bin         tokenizer.json
merges.txt         special_tokens_map.json   vocab.json

You might want to verify that the model you downloaded is complete, and note that you should download the original pytorch_model.bin weights.
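One quick way to check completeness is to compare the shard files named in pytorch_model.bin.index.json against what is actually on disk. This is a hypothetical helper sketch, assuming the standard Hugging Face weight_map layout of that index file:

```python
import json
import os

def missing_shards(model_dir: str) -> list:
    """List shard files named in pytorch_model.bin.index.json that are absent on disk."""
    with open(os.path.join(model_dir, "pytorch_model.bin.index.json")) as f:
        index = json.load(f)
    shards = sorted(set(index["weight_map"].values()))
    return [s for s in shards if not os.path.exists(os.path.join(model_dir, s))]
```

An empty result means every shard the index expects is present; anything it returns should be re-downloaded.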
To load the converted model, I use the following code:

from ipex_llm.optimize import low_memory_init, load_low_bit
from transformers import Blip2Processor, Blip2ForConditionalGeneration
model_id = "/mnt/disk1/models/optimized-blip2"
with low_memory_init():
    model = Blip2ForConditionalGeneration.from_pretrained(model_id)
model = load_low_bit(model, model_id)
print("Model loaded successfully!")

And no error occurred.

arda@arda-arc05:/mnt/disk1/models$ python blip2.py
/opt/anaconda3/envs/mingyu-llm-gpu/lib/python3.11/site-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension: ''If you don't plan on using image functionality from `torchvision.io`, you can ignore this warning. Otherwise, there might be something wrong with your environment. Did you have `libjpeg` or `libpng` installed before building `torchvision` from source?
  warn(
2024-04-29 10:47:15,896 - INFO - intel_extension_for_pytorch auto imported
2024-04-29 10:47:16,064 - INFO - Converting the current model to sym_int4 format......
Model loaded successfully!

You can refer to the relevant API in ipex-llm/python/llm/src/ipex_llm/optimize.py (main branch of intel-analytics/ipex-llm on GitHub) when writing your loading code.
