
Save sharded checkpoints #645

Open · wants to merge 5 commits into main

Conversation

LaaZa (Contributor) commented Apr 19, 2024

Continuation of #364

Mainly brought up to date without changes.
Will add test(s) and changes as needed. @fxmarty, I would like your input on this: how should we test it?

Testing with larger, real models would be appreciated.

In save_pretrained/save_quantized, use max_shard_size="1337MB". The default is 10GB, and sharding occurs automatically once the checkpoint exceeds this limit; to avoid sharding, just set the value very high.
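
A minimal sketch of the intended usage, assuming the PR exposes max_shard_size on save_quantized as described (the model paths and the 1337MB value are placeholders):

```python
from auto_gptq import AutoGPTQForCausalLM

# Load an already-quantized model (path is a placeholder).
model = AutoGPTQForCausalLM.from_quantized("path/to/quantized-model")

# Save with sharding: the checkpoint is split into shards of at most ~1337MB each.
model.save_quantized("sharded-checkpoint", max_shard_size="1337MB")

# To avoid sharding entirely, set the limit well above the model size.
model.save_quantized("single-file-checkpoint", max_shard_size="1000GB")
```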

@markoarnauto

> Testing with larger, real models would be appreciated.

Tried Meta-Llama-3-70B-Instruct, worked like a charm.
