InternLM-XComposer2-4KHD-7B multi-GPU inference error #265

Open
ly19970621 opened this issue Apr 12, 2024 · 2 comments

@ly19970621

Machine environment: 4 × RTX 4090
Command: CUDA_VISIBLE_DEVICES=0,1 python examples/example_chat.py --num_gpus 2
The following error occurs:
Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████████████████| 2/2 [00:02<00:00, 1.44s/it]
Some weights of InternLMXComposer2ForCausalLM were not initialized from the model checkpoint at /home/ai_group/model/internlm-xcomposer2/Shanghai_AI_Laboratory/internlm-xcomposer2-4khd-7b and are newly initialized: ['vit.vision_tower.vision_model.post_layernorm.bias', 'vit.vision_tower.vision_model.post_layernorm.weight'] You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Traceback (most recent call last):
  File "/home/ai_group/liuy026/multi_modality/InternLM-XComposer/examples/example_chat.py", line 26, in <module>
    model = dispatch_model(model, device_map=device_map)
  File "/home/ai_group/anaconda3/envs/liuy026-py310/lib/python3.10/site-packages/accelerate/big_modeling.py", line 351, in dispatch_model
    check_device_map(model, device_map)
  File "/home/ai_group/anaconda3/envs/liuy026-py310/lib/python3.10/site-packages/accelerate/utils/modeling.py", line 1393, in check_device_map
    raise ValueError(
ValueError: The device_map provided does not give any device for the following parameters: plora_glb_GN, plora_sub_GN
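
For reference, accelerate's check_device_map requires every parameter to be covered by some key in the device_map; plora_glb_GN and plora_sub_GN appear to be parameters defined directly on the root model, so they need their own entries. A minimal sketch for listing the uncovered names (find_unmapped is a hypothetical helper, not part of accelerate; it assumes model and device_map exist as in examples/example_chat.py):

    # Sketch: list parameters/buffers that no device_map key covers.
    def find_unmapped(model, device_map):
        keys = set(device_map.keys())
        missing = []
        for name, _ in list(model.named_parameters()) + list(model.named_buffers()):
            if not any(name == k or name.startswith(k + '.') for k in keys):
                missing.append(name)
        return missing

    print(find_unmapped(model, device_map))  # expected to include plora_glb_GN, plora_sub_GN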

@ztfmars

ztfmars commented Apr 16, 2024

(1) I fixed the device map in examples/utils.py as follows:

    device_map = {
        'vit': 0,
        'vision_proj': 0,
        'model.tok_embeddings': 0,
        # entries added so dispatch_model has a device for these root-level parameters
        'plora_glb_GN': num_gpus - 1,
        'plora_sub_GN': num_gpus - 1,
        'model.norm': num_gpus - 1,
        'output': num_gpus - 1,
    }

This works for splitting the computation across different GPUs.
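
For reference, a minimal sketch of what the full auto_configure_device_map could look like with these entries folded in, assuming the 7B model exposes 32 decoder layers under model.layers and that they are spread evenly (the version in examples/utils.py in the repo is authoritative):

    def auto_configure_device_map(num_gpus):
        # Vision tower, projector and token embeddings stay on GPU 0; the output
        # head, final norm and the two root-level plora parameters go to the last GPU.
        num_layers = 32  # assumed decoder layer count for the 7B model
        device_map = {
            'vit': 0,
            'vision_proj': 0,
            'model.tok_embeddings': 0,
            'plora_glb_GN': num_gpus - 1,
            'plora_sub_GN': num_gpus - 1,
            'model.norm': num_gpus - 1,
            'output': num_gpus - 1,
        }
        # Spread the decoder layers as evenly as possible across the GPUs.
        per_gpu = (num_layers + num_gpus - 1) // num_gpus
        for i in range(num_layers):
            device_map[f'model.layers.{i}'] = min(i // per_gpu, num_gpus - 1)
        return device_map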
@ly19970621
(2) InternLM-XComposer2-4KHD-7B inference uses too much GPU memory, up to almost 80 GB. I have an A800 and it can only just act as the inference server, which is scary!

  • pip list
Package                   Version
------------------------- ------------
accelerate                0.29.2
addict                    2.4.0
aiofiles                  23.2.1
aiohttp                   3.9.4
aiosignal                 1.3.1
aliyun-python-sdk-core    2.15.1
aliyun-python-sdk-kms     2.16.2
altair                    5.3.0
annotated-types           0.6.0
anyio                     4.3.0
async-timeout             4.0.3
attrs                     23.2.0
auto_gptq                 0.7.1
certifi                   2022.12.7
cffi                      1.16.0
charset-normalizer        2.1.1
click                     8.1.7
cmake                     3.25.0
contourpy                 1.2.1
crcmod                    1.7
cryptography              42.0.5
cycler                    0.12.1
datasets                  2.18.0
deepspeed                 0.14.1
dill                      0.3.8
einops                    0.7.0
exceptiongroup            1.2.0
fastapi                   0.110.1
ffmpy                     0.3.2
filelock                  3.9.0
flash_attn                2.5.7
fonttools                 4.51.0
frozenlist                1.4.1
fsspec                    2024.2.0
gast                      0.5.4
gekko                     1.1.1
gradio                    4.13.0
gradio_client             0.8.0
h11                       0.14.0
hjson                     3.1.0
httpcore                  1.0.5
httpx                     0.27.0
huggingface-hub           0.22.2
idna                      3.4
importlib_metadata        7.1.0
importlib_resources       6.4.0
Jinja2                    3.1.2
jmespath                  0.10.0
jsonschema                4.21.1
jsonschema-specifications 2023.12.1
kiwisolver                1.4.5
lit                       15.0.7
markdown-it-py            3.0.0
markdown2                 2.4.10
MarkupSafe                2.1.3
matplotlib                3.8.4
mdurl                     0.1.2
modelscope                1.13.3
mpmath                    1.3.0
multidict                 6.0.5
multiprocess              0.70.16
networkx                  3.2.1
ninja                     1.11.1.1
numpy                     1.24.1
orjson                    3.10.1
oss2                      2.18.4
packaging                 24.0
pandas                    2.2.2
peft                      0.10.0
pillow                    10.2.0
pip                       24.0
platformdirs              4.2.0
psutil                    5.9.8
py-cpuinfo                9.0.0
pyarrow                   15.0.2
pyarrow-hotfix            0.6
pycparser                 2.22
pycryptodome              3.20.0
pydantic                  2.7.0
pydantic_core             2.18.1
pydub                     0.25.1
Pygments                  2.17.2
pynvml                    11.5.0
pyparsing                 3.1.2
python-dateutil           2.9.0.post0
python-multipart          0.0.9
pytz                      2024.1
PyYAML                    6.0.1
referencing               0.34.0
regex                     2023.12.25
requests                  2.28.1
rich                      13.7.1
rouge                     1.0.1
rpds-py                   0.18.0
safetensors               0.4.3
scipy                     1.13.0
semantic-version          2.10.0
sentencepiece             0.1.99
setuptools                69.5.1
shellingham               1.5.4
simplejson                3.19.2
six                       1.16.0
sniffio                   1.3.1
sortedcontainers          2.4.0
starlette                 0.37.2
sympy                     1.12
timm                      0.4.12
tokenizers                0.13.3
tomli                     2.0.1
tomlkit                   0.12.0
toolz                     0.12.1
torch                     2.0.1+cu117
torchaudio                2.0.2+cu117
torchvision               0.15.2+cu117
tqdm                      4.66.2
transformers              4.33.2
triton                    2.0.0
typer                     0.12.3
typing_extensions         4.8.0
tzdata                    2024.1
urllib3                   1.26.13
uvicorn                   0.29.0
websockets                11.0.3
wheel                     0.43.0
XlsxWriter                3.1.2
xxhash                    3.4.1
yapf                      0.40.2
yarl                      1.9.4
zipp                      3.18.1

  • code scripts
import sys
sys.path.insert(0, '.')
sys.path.insert(0, '..')
import argparse
import torch
from modelscope import snapshot_download, AutoModel, AutoTokenizer
from examples.utils import auto_configure_device_map

torch.set_grad_enabled(False)

parser = argparse.ArgumentParser()
parser.add_argument("--num_gpus", default=1, type=int)
parser.add_argument("--dtype", default='fp16', type=str)
args = parser.parse_args()

# init model and tokenizer
model_dir = snapshot_download('Shanghai_AI_Laboratory/internlm-xcomposer2-4khd-7b')
model = AutoModel.from_pretrained(model_dir, trust_remote_code=True).cuda().eval()
tokenizer = AutoTokenizer.from_pretrained(model_dir, trust_remote_code=True)

if args.dtype == 'fp16':
    model.half().cuda()
elif args.dtype == 'fp32':
    model.cuda()

if args.num_gpus > 1:
    from accelerate import dispatch_model
    device_map = auto_configure_device_map(args.num_gpus)
    model = dispatch_model(model, device_map=device_map)

###############
# First Round
###############
query = '<ImageHere>Illustrate the fine details present in the image'
image = 'examples/4khd_example.webp'
with torch.cuda.amp.autocast():
  response, his = model.chat(tokenizer, query=query, image=image, hd_num=55, history=[], do_sample=False, num_beams=3)

print("**"*10)
print("-------> first round")
print(response)
print("**"*10)
# The image is a vibrant and colorful infographic that showcases 7 graphic design trends that will dominate in 2021. The infographic is divided into 7 sections, each representing a different trend. 
# Starting from the top, the first section focuses on "Muted Color Palettes", highlighting the use of muted colors in design.
# The second section delves into "Simple Data Visualizations", emphasizing the importance of easy-to-understand data visualizations. 
# The third section introduces "Geometric Shapes Everywhere", showcasing the use of geometric shapes in design. 
# The fourth section discusses "Flat Icons and Illustrations", explaining how flat icons and illustrations are being used in design. 
# The fifth section is dedicated to "Classic Serif Fonts", illustrating the resurgence of classic serif fonts in design.
# The sixth section explores "Social Media Slide Decks", illustrating how slide decks are being used on social media. 
# Finally, the seventh section focuses on "Text Heavy Videos", illustrating the trend of using text-heavy videos in design. 
# Each section is filled with relevant images and text, providing a comprehensive overview of the 7 graphic design trends that will dominate in 2021.

###############
# Second Round
###############
query1 = 'what is the detailed explanation of the third part.'
with torch.cuda.amp.autocast():
  response, _ = model.chat(tokenizer, query=query1, image=image, hd_num=55, history=his, do_sample=False, num_beams=3)

print("**"*10)
print("-------> second round")
print(response)
print("**"*10)
# The third part of the infographic is about "Geometric Shapes Everywhere". It explains that last year, designers used a lot of
# flowing and abstract shapes in their designs. However, this year, they have been replaced with rigid, hard-edged geometric
# shapes and patterns. The hard edges of a geometric shape create a great contrast against muted colors.
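
One likely contributor to the memory spike above: AutoModel.from_pretrained(...).cuda() materializes fp32 weights on GPU 0 before model.half() runs, roughly doubling the peak footprint. A minimal sketch of loading directly in half precision and only then dispatching (an assumption about the cause, not a confirmed fix; torch_dtype is the standard from_pretrained argument):

    import torch
    from modelscope import snapshot_download, AutoModel, AutoTokenizer

    num_gpus = 2  # example value

    model_dir = snapshot_download('Shanghai_AI_Laboratory/internlm-xcomposer2-4khd-7b')
    # Load fp16 weights on CPU first so full-precision copies never hit a GPU.
    model = AutoModel.from_pretrained(
        model_dir, torch_dtype=torch.float16, trust_remote_code=True).eval()
    tokenizer = AutoTokenizer.from_pretrained(model_dir, trust_remote_code=True)

    if num_gpus > 1:
        from accelerate import dispatch_model
        from examples.utils import auto_configure_device_map
        model = dispatch_model(model, device_map=auto_configure_device_map(num_gpus))
    else:
        model = model.cuda()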

So is there anything wrong with how I am using a 7B model like this?
Can you provide a script for a quantized InternLM-XComposer2-4KHD-7B (int4)? @myownskyW7
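
In case it is useful while waiting for an official answer, a generic 4-bit loading sketch using bitsandbytes through transformers' BitsAndBytesConfig; this assumes the remote-code model accepts quantization_config and device_map='auto', requires the bitsandbytes package, and is not the project's official int4 path:

    import torch
    from transformers import AutoModel, AutoTokenizer, BitsAndBytesConfig

    model_path = 'internlm/internlm-xcomposer2-4khd-7b'  # or a local snapshot directory
    # NF4 4-bit quantization of the linear layers via bitsandbytes.
    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type='nf4',
        bnb_4bit_compute_dtype=torch.float16,
    )
    model = AutoModel.from_pretrained(
        model_path,
        quantization_config=bnb_config,
        device_map='auto',
        trust_remote_code=True,
    ).eval()
    tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)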

@Cloopen-ReLiNK

OOM error on a single 48 GB GPU.
