
7B version cannot run on multiple GPUs #303

Open
ybshaw opened this issue May 6, 2024 · 1 comment
ybshaw commented May 6, 2024

Using the officially provided 7B model, it fails to run on a single RTX card with 24 GB of memory and throws an OOM error. Specifying GPU IDs has no effect; the model still occupies only GPU 0. How should I run inference so that it works?

import torch
from transformers import AutoModel, AutoTokenizer

torch.set_grad_enabled(False)
ckpt_path = '/home/my/.cache/modelscope/hub/Shanghai_AI_Laboratory/internlm-xcomposer2-vl-7b'


# init model and tokenizer
# .cuda() moves the entire model onto a single GPU (cuda:0)
model = AutoModel.from_pretrained(ckpt_path, trust_remote_code=True).cuda().eval()
tokenizer = AutoTokenizer.from_pretrained(ckpt_path, trust_remote_code=True)

text = '<ImageHere>仔细描述这张图'
image = '/home/my/cat.jpg'
with torch.cuda.amp.autocast():
    response, _ = model.chat(tokenizer, query=text, image=image, history=[], do_sample=False)
print(response)

Error: CUDA out-of-memory (OOM)
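One thing worth noting here: by default, from_pretrained loads weights in float32, so a 7B model needs roughly 28 GB for the weights alone and cannot fit on a 24 GB card. A minimal sketch of a first thing to try (assuming the checkpoint works in half precision, which I have not verified for this model): pass torch_dtype=torch.float16 at load time so the weights take about half the memory.

import torch
from transformers import AutoModel, AutoTokenizer

torch.set_grad_enabled(False)
ckpt_path = '/home/my/.cache/modelscope/hub/Shanghai_AI_Laboratory/internlm-xcomposer2-vl-7b'

# Load in half precision: ~14 GB of weights in fp16 instead of ~28 GB in
# fp32, which gives the 7B model a chance of fitting on one 24 GB card.
model = AutoModel.from_pretrained(
    ckpt_path,
    torch_dtype=torch.float16,
    trust_remote_code=True,
).cuda().eval()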

Specifying all GPU IDs in the code (machine info: 4 cards, 24 GB memory each):

import os

# This only controls which GPUs are visible to the process;
# it does not by itself spread the model across them.
os.environ['CUDA_VISIBLE_DEVICES'] = '0,1,2,3'

import torch
from transformers import AutoModel, AutoTokenizer

torch.set_grad_enabled(False)
ckpt_path = '/home/my/.cache/modelscope/hub/Shanghai_AI_Laboratory/internlm-xcomposer2-vl-7b'


# init model and tokenizer
# .cuda() still places the entire model on the first visible GPU
model = AutoModel.from_pretrained(ckpt_path, trust_remote_code=True).cuda().eval()
tokenizer = AutoTokenizer.from_pretrained(ckpt_path, trust_remote_code=True)

text = '<ImageHere>仔细描述这张图'
image = '/home/my/cat.jpg'
with torch.cuda.amp.autocast():
    response, _ = model.chat(tokenizer, query=text, image=image, history=[], do_sample=False)
print(response)

Still the same error. Checking nvidia-smi shows the model in fact still runs on one card and is not distributed across the other cards.
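This behavior is expected: CUDA_VISIBLE_DEVICES only determines which devices the process can see, while .cuda() always moves the whole model onto the first visible device (cuda:0). To actually shard the weights across all four cards, the model has to be loaded with a device map. Below is a hedged sketch using the standard device_map='auto' dispatch from transformers/accelerate; it assumes accelerate is installed and that the model's custom remote code is compatible with sharded dispatch, which I have not verified for internlm-xcomposer2.

import os
os.environ['CUDA_VISIBLE_DEVICES'] = '0,1,2,3'

import torch
from transformers import AutoModel, AutoTokenizer

torch.set_grad_enabled(False)
ckpt_path = '/home/my/.cache/modelscope/hub/Shanghai_AI_Laboratory/internlm-xcomposer2-vl-7b'

# device_map='auto' (needs `pip install accelerate`) splits the layers
# across all visible GPUs. Do not call .cuda() afterwards, or the model
# is moved back onto a single device.
model = AutoModel.from_pretrained(
    ckpt_path,
    torch_dtype=torch.float16,
    trust_remote_code=True,
    device_map='auto',
).eval()
tokenizer = AutoTokenizer.from_pretrained(ckpt_path, trust_remote_code=True)

text = '<ImageHere>仔细描述这张图'
image = '/home/my/cat.jpg'
with torch.cuda.amp.autocast():
    response, _ = model.chat(tokenizer, query=text, image=image, history=[], do_sample=False)
print(response)

If the model's custom chat() code hard-codes device placement internally, this can still fail; in that case the fp16 single-card load shown earlier is the more reliable option.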

@XueFengHF
Same problem here: 4x 3090. The example only runs on a single card, finetuning runs out of GPU memory on a single card, and multi-card finetuning fails with ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: -9) local_rank: 2 (pid: 15250) of binary: /opt/conda/envs/internlm/bin/python
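A note on that error: exitcode -9 means local rank 2 received SIGKILL, which is usually the Linux OOM killer reclaiming host RAM rather than a GPU error; with 4 ranks, each process loads its own full copy of the checkpoint into CPU memory before moving it to its GPU. A small sketch of one mitigation (low_cpu_mem_usage is a standard from_pretrained option, but whether it resolves this particular finetune crash is an assumption):

import torch
from transformers import AutoModel

ckpt_path = '/home/my/.cache/modelscope/hub/Shanghai_AI_Laboratory/internlm-xcomposer2-vl-7b'

# low_cpu_mem_usage=True builds the model on the meta device and loads
# checkpoint shards one at a time, roughly halving peak host RAM per process.
model = AutoModel.from_pretrained(
    ckpt_path,
    torch_dtype=torch.float16,
    trust_remote_code=True,
    low_cpu_mem_usage=True,
)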
