AssertionError: zero stage 1 requires an optimizer #987

Open
yonglianglan opened this issue Jul 4, 2023 · 3 comments
Labels: bug, good first issue, help wanted

Comments

@yonglianglan

An error occurred when running the evaluation code. The command was:
python ./deepy.py evaluate.py xxxx.yml --eval_tasks piqa

[screenshot: traceback ending in AssertionError: zero stage 1 requires an optimizer]

Training was run in multi-machine mode, while evaluation is run in single-machine mode.

Has anyone had a similar issue? Thanks!

@yonglianglan added the bug label on Jul 4, 2023
@StellaAthena
Member

This is a known issue that is awkward to handle. Our current recommendation is to set ZeRO stage 0 when calling the evaluation script. We are working on integrating DeepSpeed Inference which will solve this issue and substantially accelerate inference tasks as well.
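For reference, a minimal override in the evaluation .yml might look like the following (a sketch; the exact surrounding keys depend on your config):

  "zero_optimization": {
    "stage": 0
  },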

@StellaAthena added the good first issue and help wanted labels on Jul 31, 2023
@vsabavat

vsabavat commented Nov 14, 2023

Is this bug resolved? How do we pass or set the ZeRO stage? I also see the same error during inference.

python ./deepy.py generate.py -d configs 125M local_setup text_generation

  File "generate.py", line 91, in <module>
    main()
  File "generate.py", line 33, in main
    model, neox_args = setup_for_inference_or_eval(use_cache=True)
  File "/localhome/local-vsabavat/ai/training/gpt-neox/megatron/utils.py", line 448, in setup_for_inference_or_eval
    model, _, _ = setup_model_and_optimizer(
  File "/localhome/local-vsabavat/ai/training/gpt-neox/megatron/training.py", line 647, in setup_model_and_optimizer
    model, optimizer, _, lr_scheduler = deepspeed.initialize(
  File "/localhome/local-vsabavat/.local/lib/python3.8/site-packages/deepspeed/__init__.py", line 186, in initialize
    engine = PipelineEngine(args=args,
  File "/localhome/local-vsabavat/.local/lib/python3.8/site-packages/deepspeed/runtime/pipe/engine.py", line 68, in __init__
    super().__init__(*super_args, **super_kwargs)
  File "/localhome/local-vsabavat/.local/lib/python3.8/site-packages/deepspeed/runtime/engine.py", line 309, in __init__
    self.optimizer = self._configure_zero_optimizer(optimizer=None)
  File "/localhome/local-vsabavat/.local/lib/python3.8/site-packages/deepspeed/runtime/engine.py", line 1468, in _configure_zero_optimizer
    assert not isinstance(optimizer, DummyOptim), "zero stage {} requires an optimizer".format(zero_stage)
AssertionError: zero stage 1 requires an optimizer
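For context, the assertion fires because the inference/eval path builds the model without an optimizer, while ZeRO stages >= 1 need a real optimizer whose state they can partition. A minimal sketch of the failure mode, assuming a deepspeed-launched process and a DeepSpeed version like the one in the traceback above (not the actual gpt-neox code):

import torch
import deepspeed

model = torch.nn.Linear(8, 8)

# ZeRO stage >= 1 partitions optimizer state across ranks, so DeepSpeed
# refuses to build the ZeRO wrapper when no optimizer is supplied.
engine, *_ = deepspeed.initialize(
    model=model,
    optimizer=None,  # the inference/eval path passes no optimizer
    config={
        "train_batch_size": 1,
        "zero_optimization": {"stage": 1},
    },
)
# -> AssertionError: zero stage 1 requires an optimizer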

@AIproj
Contributor

AIproj commented Nov 27, 2023

@vsabavat In one of your .yml config files you should have something that looks like this:

  "zero_optimization": {
    "stage": 1,
    "allgather_partitions": true,
    "allgather_bucket_size": 1260000000,
    "overlap_comm": true,
    "reduce_scatter": true,
    "reduce_bucket_size": 1260000000,
    "contiguous_gradients": true,
    "cpu_offload": false
  },

In my example, the stage is set to 1; changing that value changes the ZeRO stage (e.g. "stage": 0 for evaluation, as recommended above).
