
Default LoRA training consumes 60 GB of GPU memory #253

Open
is opened this issue Jun 18, 2023 · 0 comments

Comments

@is commented Jun 18, 2023

On the latest dev branch (6a42db4c1fdffee9ccc8f7d91775c5b4112738f6):

Using the default configuration: LoRA enabled, no quantization, no DeepSpeed.

# Module configuration; LoRA enabled by default
enable_deepspeed = False
enable_ptv2 = False
enable_lora = True
enable_int8 = False # qlora int8
enable_int4 = False # qlora int4
INFO: 
  | Name                                  | Type      | Params
--------------------------------------------------------------------
0 | _TransformerLightningModule__backbone | LoraModel | 6.2 B 
--------------------------------------------------------------------
3.7 M     Trainable params
6.2 B     Non-trainable params
6.2 B     Total params
24,704.811  Total estimated model params size (MB)

Running train.py directly consumes 50–60 GB of GPU memory,
and it OOMs on a single V100S. Is this expected?
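
For a rough sense of whether this is plausible, here is a back-of-envelope estimate (an assumption-laden sketch, not output from the repo), assuming the base weights stay in fp32 (consistent with the 24,704.811 MB figure above) and Adam optimizer states are kept only for the 3.7 M trainable LoRA parameters:

# Back-of-envelope VRAM estimate for LoRA fine-tuning of a 6.2 B-parameter
# model. Assumptions (not from the repo): base weights in fp32, Adam
# optimizer, no quantization.
total_params = 6.2e9
trainable_params = 3.7e6

weights_gb = total_params * 4 / 1e9    # fp32 base weights: ~24.8 GB
grads_gb = trainable_params * 4 / 1e9  # gradients for LoRA params only: ~0.015 GB
adam_gb = trainable_params * 8 / 1e9   # Adam m/v states: ~0.030 GB

print(f"fixed cost: ~{weights_gb + grads_gb + adam_gb:.1f} GB")  # ~24.8 GB

# Activations and temporary buffers scale with batch size and sequence
# length; in fp32 they can add tens of GB for a 6 B model, so 50-60 GB
# total (and an OOM on a 32 GB V100S) is not implausible without
# int8/int4 quantization or mixed precision.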
