
[QUESTION] Validation loss & PPL keep going up #787

Open

zhentingqi opened this issue Apr 20, 2024 · 0 comments

Hi, I was training the 345M GPT-2 model using your example script examples/pretrain_gpt.sh. The validation loss and PPL, however, keep going up, while the training loss decreases as expected.
[screenshot: training loss curve decreasing while validation loss and PPL curves increase]
My hyperparameters are shown here:

GPT_ARGS="
    --num-layers 24 \
    --hidden-size 1024 \
    --num-attention-heads 16 \
    --seq-length 1024 \
    --max-position-embeddings 1024 \
    --micro-batch-size 2 \
    --global-batch-size 4 \
    --lr 3.0e-4 \
    --train-iters 300000 \
    --lr-decay-iters 320000 \
    --lr-decay-style cosine \
    --min-lr 1.0e-5 \
    --weight-decay 1e-2 \
    --lr-warmup-fraction .01 \
    --clip-grad 1.0 \
    --fp16
"

DATA_ARGS="
    --data-path $DATA_PATH \
    --vocab-file $VOCAB_FILE \
    --merge-file $MERGE_FILE \
    --data-impl mmap \
    --split 700,200,100
"

OUTPUT_ARGS="
    --log-interval 100 \
    --save-interval 50000 \
    --eval-interval 1000 \
    --eval-iters 10
"

Can anyone please tell me what is wrong? Shouldn't the PPL decrease? Thanks!
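(One note on the two curves moving together: as far as I can tell, the reported PPL is just the exponentiated average validation loss, so once the validation loss rises, the PPL must rise with it. A minimal sketch, assuming PPL = exp(mean validation loss) with a clamp on the exponent for numerical stability, as in Megatron-LM's reporting:)

import math

# Perplexity as reported is the exponentiated average validation loss;
# the exponent is clamped so the value can't overflow.
def ppl(avg_val_loss: float) -> float:
    return math.exp(min(20.0, avg_val_loss))

print(ppl(3.0))  # ~20.1 -- rises monotonically with the loss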
