ddp ERROR #71

liyingjie1991 · 2024-01-05T13:26:47Z

Hi, when I run the training code, I get the following error. Can you give me some advice?
` File "/ssd5/exec/liyj/miniconda3/envs/seamless/lib/python3.9/site-packages/torch/_dynamo/utils.py", line 163, in time_wrapper
r = func(*args, **kwargs)
File "/ssd5/exec/liyj/miniconda3/envs/seamless/lib/python3.9/site-packages/torch/_dynamo/output_graph.py", line 675, in call_user_compiler
raise BackendCompilerFailed(self.compiler_fn, e) from e
torch._dynamo.exc.BackendCompilerFailed: compile_fn raised TypeError: _convert_frame_assert() missing 1 required positional argument: 'hooks'

Set torch._dynamo.config.verbose=True for more information

You can suppress this exception and fall back to eager by setting:
torch._dynamo.config.suppress_errors = True

Traceback (most recent call last):
File "/ssd5/exec/liyj/miniconda3/envs/seamless/lib/python3.9/site-packages/torch/_dynamo/output_graph.py", line 670, in call_user_compiler
compiled_fn = compiler_fn(gm, self.fake_example_inputs())
File "/ssd5/exec/liyj/miniconda3/envs/seamless/lib/python3.9/site-packages/torch/_dynamo/backends/distributed.py", line 203, in compile_fn
return self.backend_compile_fn(gm, example_inputs)
TypeError: _convert_frame_assert() missing 1 required positional argument: 'hooks'`

Version:
torch: '2.0.1+cu117'

sanchit-gandhi · 2024-01-17T16:44:49Z

Hey @liyingjie1991 - are you using torch compile while training? I haven't personally tested training with this configuration, but I would expect it to work, since training uses static shapes. The generate step during evaluation probably won't work, though: we use a dynamic k/v cache in Transformers, so the shapes are dynamic. If you're using torch compile, could you try disabling it for evaluation?
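
One way to try this, as a rough sketch rather than what the training script actually does: keep a compiled wrapper for the training forward pass and reuse the original, uncompiled module for evaluation. The helper name `split_train_eval` and the toy `nn.Linear` model are hypothetical stand-ins, and `backend="eager"` is used here only so the sketch runs without a compiler toolchain; a real run would likely keep the default backend.

```python
import torch
from torch import nn


def split_train_eval(model: nn.Module, use_compile: bool = True, backend: str = "inductor"):
    """Hypothetical helper: return (train_model, eval_model) sharing the same parameters."""
    # torch.compile wraps the module lazily; compilation only happens on the first call.
    train_model = torch.compile(model, backend=backend) if use_compile else model
    # The original module is left untouched, so calling it runs in eager mode.
    eval_model = model
    return train_model, eval_model


# Toy stand-in for the real model (assumption, not from the training script).
toy = nn.Linear(4, 2)
train_model, eval_model = split_train_eval(toy, use_compile=True, backend="eager")

# Evaluation goes through the uncompiled module, so dynamic shapes are not traced.
eval_model.eval()
with torch.no_grad():
    out = eval_model(torch.randn(3, 4))
```

Because the compiled wrapper and the original module hold the same parameters, weights updated through `train_model` are visible when evaluating with `eval_model`.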
