What GPU configuration is needed for training? #36

Open
nevermorewish opened this issue Apr 11, 2024 · 4 comments

Comments

@nevermorewish

nevermorewish commented Apr 11, 2024

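On training, the paper says: "We train LRM on 128 NVIDIA (40G) A100 GPUs with batch size 1024 (1024 different shapes per iteration) for 30 epochs, taking about 3 days to complete. Each epoch contains one copy of the rendered image data from Objaverse and MV."
The dataset used in the paper, allenai/objaverse, is several TB in size.

I don't have that many A100s. Would it work to train on a single A100 with less data?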

@nevermorewish nevermorewish changed the title from "What GPU configuration is needed for training? Is an A6000 enough?" to "What GPU configuration is needed for training?" on Apr 11, 2024
@ZexinHe
Collaborator

ZexinHe commented Apr 15, 2024

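Hello, a single A100 is probably too little. You can check how well training converges with less data, but I expect the results may not be very promising.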
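To put the numbers quoted from the paper in perspective, here is a rough back-of-the-envelope sketch. It uses only the figures in this thread (128 GPUs, batch size 1024, about 3 days) and assumes perfect linear scaling across GPUs, which real training will not achieve, so treat the result as an order-of-magnitude estimate rather than a measurement. The function names are made up for illustration.

```python
# Figures quoted from the paper in this thread.
NUM_GPUS = 128            # NVIDIA A100 (40G)
GLOBAL_BATCH_SIZE = 1024  # shapes per iteration
TRAINING_DAYS = 3         # "about 3 days"


def per_gpu_batch(global_batch: int, num_gpus: int) -> int:
    """Number of shapes each GPU processes per iteration."""
    return global_batch // num_gpus


def gpu_hours(num_gpus: int, days: float) -> float:
    """Total compute budget in GPU-hours."""
    return num_gpus * days * 24


def days_on(num_gpus_available: int, total_gpu_hours: float) -> float:
    """Wall-clock days on fewer GPUs.

    Assumption: perfect linear scaling and identical GPUs.
    """
    return total_gpu_hours / (num_gpus_available * 24)


if __name__ == "__main__":
    budget = gpu_hours(NUM_GPUS, TRAINING_DAYS)
    print("per-GPU batch:", per_gpu_batch(GLOBAL_BATCH_SIZE, NUM_GPUS))
    print("GPU-hours:", budget)
    print("days on a single A100:", days_on(1, budget))
```

Under these assumptions, the full schedule on one A100 would take on the order of a year, which is why cutting the data or the number of epochs substantially would be unavoidable on a single card.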

@juanfraherrero

I was trying to run it on my laptop's RTX 1050.
Impossible to run, even with a batch size of 1 and every config set as low as possible. Always CUDA out of memory, hahaha!

@hayoung-jeremy

Hi @juanfraherrero, I'm very new to AI and not sure how to properly prepare the data and run training.
I've posted my question on this issue.
Could you please check it when possible?
Thank you in advance!

@ZexinHe
Collaborator

ZexinHe commented May 6, 2024

Hi @juanfraherrero,

You can try decreasing the frame_size here and see if it still throws an OOM error. If it still doesn't work, then I guess the RTX 1050 is not enough to do inference :(
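One way to automate that trial and error is to halve the value until inference stops running out of memory. The sketch below is generic and not taken from this repository: `run_inference` is a hypothetical callable that you would wrap around your own inference call, and the helper only relies on the fact that PyTorch reports CUDA OOM as a `RuntimeError` whose message contains "out of memory".

```python
from typing import Callable, Optional


def largest_frame_size_that_fits(
    run_inference: Callable[[int], None],
    start: int,
    minimum: int = 1,
) -> Optional[int]:
    """Halve frame_size until run_inference succeeds.

    run_inference is a hypothetical user-supplied callable that runs
    inference with the given frame_size and raises on failure.
    Returns the first size that works, or None if even `minimum` fails.
    """
    size = start
    while size >= minimum:
        try:
            run_inference(size)
            return size
        except RuntimeError as err:
            # Only swallow out-of-memory errors; re-raise anything else.
            if "out of memory" not in str(err).lower():
                raise
            size //= 2
    return None
```

In a real PyTorch session it may also help to call `torch.cuda.empty_cache()` between attempts. If the function returns `None`, the GPU likely cannot run inference at any setting.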
