Fix custom token in train.py #246

naufalso · 2023-04-26T01:18:40Z

After fine-tuning the LLaMA model with the existing training code, I noticed that the model never outputs the EOS token, so generation does not stop until `max_new_tokens` is reached.

While debugging, I found that `tokenizer.eos_token`, `tokenizer.bos_token`, and `tokenizer.unk_token` are all `''` (empty string).

Because `''` (empty string) is not equal to `None`, the training code's checks against `None` never pass, so the custom tokens are not added. I suggest fixing this with the changes in this PR.

I have verified that after training with the modified code, the model outputs the EOS token correctly.
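To illustrate the failure mode, here is a minimal sketch of the difference between a strict `None` check and a check that also treats `''` as missing. It is not the actual diff in this PR: the helper `missing_special_tokens`, the `DEFAULT_TOKENS` values, and the stand-in tokenizer object are all hypothetical and chosen only for illustration.

```python
from types import SimpleNamespace

# Placeholder token strings, assumed for illustration only.
DEFAULT_TOKENS = {
    "eos_token": "</s>",
    "bos_token": "<s>",
    "unk_token": "<unk>",
}


def missing_special_tokens(tokenizer, strict_none_check=False):
    """Return a dict of special tokens that still need to be added.

    Hypothetical helper: with strict_none_check=True it mimics a check
    of the form `token is None`; otherwise it also treats '' as missing.
    """
    to_add = {}
    for name, default in DEFAULT_TOKENS.items():
        current = getattr(tokenizer, name, None)
        if strict_none_check:
            is_missing = current is None
        else:
            is_missing = not current  # True for both None and ''
        if is_missing:
            to_add[name] = default
    return to_add


# Stand-in for the tokenizer state described above: every token is ''.
fake_tokenizer = SimpleNamespace(eos_token="", bos_token="", unk_token="")

old_result = missing_special_tokens(fake_tokenizer, strict_none_check=True)
new_result = missing_special_tokens(fake_tokenizer)

print("strict None check:", old_result)
print("None-or-empty check:", new_result)
```

With the strict check nothing is selected for addition, which matches the behavior described above. In a real training script, the resulting dict would presumably be passed to the tokenizer's `add_special_tokens` method, followed by resizing the model's token embeddings.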

