use_reentrant=False can't be set properly #30749
Comments
Hi there! It seems like you are encountering a warning related to the In your code snippet, you have set To address this warning and ensure that
checkpointing_args = {"use_reentrant": False}
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    data_collator=data_collator,
    gradient_checkpointing=True,
    gradient_checkpointing_kwargs=checkpointing_args
)
By explicitly passing If you have any further questions or if the issue persists, feel free to ask for more assistance!
TypeError: Trainer.__init__() got an unexpected keyword argument 'gradient_checkpointing_kwargs'
It doesn't work. BTW, are you a model?
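The TypeError is consistent with gradient_checkpointing and gradient_checkpointing_kwargs being fields of TrainingArguments rather than parameters of the Trainer constructor. A minimal sketch of where they go, assuming a transformers version that has the gradient_checkpointing_kwargs field; the helper names and the "out" directory are placeholders for illustration, not taken from the issue. It only shows where the setting is passed and does not by itself explain the warning reported in the issue:

```python
from typing import Any, Dict


def checkpointing_arguments(use_reentrant: bool = False) -> Dict[str, Any]:
    """Keyword arguments for TrainingArguments that enable gradient checkpointing."""
    return {
        "gradient_checkpointing": True,
        # Forwarded to the model when gradient checkpointing is enabled.
        "gradient_checkpointing_kwargs": {"use_reentrant": use_reentrant},
    }


def build_training_args(output_dir: str = "out"):
    """Create TrainingArguments; "out" is a placeholder output directory."""
    # Imported lazily so the helper above works without transformers installed.
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **checkpointing_arguments())
```

The resulting object is then handed to the trainer as Trainer(model=model, args=training_args, ...), with no checkpointing keywords on Trainer itself.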
@cw235 There are a series of issues you've commented on e.g. here, here and here in which the issue has clearly been fed through to a chat model e.g. ChatGPT to produce an output which does not answer or address the question. Please refrain from doing this, it is both unhelpful to the person reporting the issue and isn't scalable behaviour: what would happen if everyone started doing this? All of these will be marked as spam so as to hide them and make the issues navigable. If you keep doing this, the account @cw235 will be reported.
I couldn't recreate this warning. Can you try either:
And let us know if you get this warning? Thanks!
Thank you. Could it be related to specific models with the argument
System Info
transformers==4.40.1
deepspeed==0.14.2
torch==2.2.1
Who can help?
@ArthurZucker
Hello, I used the transformers Trainer with DeepSpeed to train a decoder-only model. As a common way to reduce memory use, I enabled gradient checkpointing and set use_reentrant to False in my code:
When I printed the training_args, the setting showed up properly:
The training_args is properly passed to Trainer with:
However, when training started, I was still warned every time with the message:
I don't understand what causes this message, but it seems that use_reentrant is not being set properly and so does not take effect. Could anyone please help me take a look at the problem?
Information
Tasks
examples folder (such as GLUE/SQuAD, ...)
Reproduction
Some of the DeepSpeed configs are as follows:
Expected behavior
No warning message