
Add generating batch size #592

Open · wants to merge 2 commits into main

Conversation

zhuzilin (Contributor)
The original implementation generated samples one by one, which results in low GPU utilization for small models. This commit adds a generating-batch-size config to enable batched generation.

Thank you for your time reviewing this PR :)
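The idea described above can be sketched as follows. This is a minimal illustration, not the PR's actual code: `generating_batch_size` and `generate_fn` are hypothetical names standing in for the new config option and the model's generate call.

```python
# Hedged sketch of batched generation: instead of one forward pass per
# prompt, group prompts into chunks of `generating_batch_size` and run
# one forward pass per chunk. Names here are illustrative, not the
# repository's real API.

def batched(items, batch_size):
    """Yield successive batches of at most `batch_size` items."""
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]

def generate_all(prompts, generate_fn, generating_batch_size=8):
    """Run `generate_fn` on whole batches instead of one prompt at a time."""
    outputs = []
    for batch in batched(prompts, generating_batch_size):
        outputs.extend(generate_fn(batch))  # one batched forward pass
    return outputs
```

With a larger `generating_batch_size`, each forward pass amortizes kernel-launch and memory-transfer overhead across more samples, which is where the GPU-utilization gain for small models comes from.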

@zhuzilin zhuzilin requested a review from a team as a code owner March 18, 2022 05:39
@StellaAthena StellaAthena linked an issue Mar 20, 2022 that may be closed by this pull request
@StellaAthena (Member) left a comment

I think a better solution is to drop the word train from the name of the existing argument and use the existing argument for generation. Having two seems needlessly confusing.

@zhuzilin (Contributor, author)

> I think a better solution is to drop the word train from the name of the existing argument and use the existing argument for generation. Having two seems needlessly confusing.

@StellaAthena I found that the train_batch_size arg is tied to train_micro_batch_size_per_gpu, which is involved in gradient accumulation. There is also a batch_size arg used solely with the LM Evaluation Harness:

self.batch_size = batch_size or (
neox_args.batch_size * self.dp_world_size
) # default batch size to bs per gpu * dp size

Maybe using the batch_size arg is better?
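The snippet quoted above computes the harness default as per-GPU batch size times data-parallel world size. A small sketch of that fallback, with illustrative parameter names (the real code reads these off `neox_args` and the runtime):

```python
# Hedged sketch of the quoted default: an explicit batch_size wins;
# otherwise fall back to per-GPU batch size * data-parallel world size.
# Parameter names are illustrative, not the repository's actual ones.

def default_eval_batch_size(batch_size, per_gpu_batch_size, dp_world_size):
    """Mirror `batch_size or (neox_args.batch_size * self.dp_world_size)`."""
    return batch_size or (per_gpu_batch_size * dp_world_size)
```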

@StellaAthena (Member) commented Mar 23, 2022

My suggestion is to rename train_micro_batch_size_per_gpu to micro_batch_size_per_gpu and allow it to fill both the train and the inference roles.

@zhuzilin (Contributor, author)

@StellaAthena
Just to double-check: you mean rename train_batch_size to batch_size, train_micro_batch_size_per_gpu to micro_batch_size_per_gpu, and use micro_batch_size_per_gpu for inference/generation, right?

@StellaAthena (Member)

> @StellaAthena Just a double check, you mean rename train_batch_size to batch_size, train_micro_batch_size_per_gpu to micro_batch_size_per_gpu and use micro_batch_size_per_gpu for inference/generation, right?

No, I mean just the second rename. For a fixed micro-batch size per GPU, the training batch size and the eval batch size are typically different due to gradient accumulation.
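The relation behind this comment can be written out explicitly. This is a sketch under the usual DeepSpeed-style convention; the function and parameter names are illustrative:

```python
# Hedged sketch: the global train batch size is the per-GPU micro batch
# times the gradient-accumulation steps times the data-parallel world
# size. Generation does no accumulation, so only micro_batch_size_per_gpu
# is meaningfully shared between training and inference.

def global_train_batch_size(micro_batch_size_per_gpu,
                            gradient_accumulation_steps,
                            dp_world_size):
    """Train batch size implied by a given micro batch size."""
    return micro_batch_size_per_gpu * gradient_accumulation_steps * dp_world_size
```

This is why renaming only train_micro_batch_size_per_gpu works: the micro batch size is the one knob both phases use, while the global train batch size carries an accumulation factor that has no inference counterpart.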

@zhuzilin
Copy link
Contributor Author

@StellaAthena Updated.

@zhuzilin
Copy link
Contributor Author

@StellaAthena Hi, could you help me to fix the error raised in CI? Thank you!

@CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

Successfully merging this pull request may close these issues.

generate.py not utilizing GPU in full
3 participants