
Error occurs when I add new tokens to the tokenizer. #237

Open
charlesCXK opened this issue Mar 12, 2024 · 6 comments
Labels: currently fixing (Am fixing now!)

Comments


charlesCXK commented Mar 12, 2024

Hi,
I want to add new tokens to the tokenizer through:

tokenizer.add_tokens("<NEW1>", special_tokens=True) 
tokenizer.add_tokens("<NEW2>", special_tokens=True) 
model.resize_token_embeddings(len(tokenizer))
model.config.vocab_size = len(tokenizer)

Then I save the model as LoRA adapters through:

model.save_pretrained(save_path)

When I load the model, the error occurs:

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = save_path, # YOUR MODEL YOU USED FOR TRAINING
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
)
RuntimeError: Error(s) in loading state_dict for PeftModelForCausalLM:
        size mismatch for base_model.model.model.embed_tokens.weight: copying a param with shape torch.Size([32002, 4096]) from checkpoint, the shape in current model is torch.Size([32000, 4096]).
        size mismatch for base_model.model.lm_head.weight: copying a param with shape torch.Size([32002, 4096]) from checkpoint, the shape in current model is torch.Size([32000, 4096]).

It seems that the saved checkpoint does not match the predefined model architecture (whose vocabulary still has 32000 rows). What should I do to solve this issue?

Thanks,
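One generic workaround, sketched here with plain transformers and peft rather than FastLanguageModel (the base model id is a placeholder and not from this thread), is to resize the base model's embeddings to the saved tokenizer's length before attaching the adapter:

# Sketch only: load the base model, grow its embeddings to match the checkpoint,
# then attach the LoRA adapter so the 32002-row matrices fit.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

tokenizer = AutoTokenizer.from_pretrained(save_path)              # tokenizer saved with the new tokens
base_model = AutoModelForCausalLM.from_pretrained("base-model-name")  # placeholder base model id
base_model.resize_token_embeddings(len(tokenizer))                # 32000 -> 32002 rows
model = PeftModel.from_pretrained(base_model, save_path)          # adapter shapes now match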


danielhanchen commented Mar 13, 2024

@charlesCXK Oh I think you'll have to add modules_to_save, i.e. https://github.com/unslothai/unsloth/wiki#finetuning-the-lm_head-and-embed_tokens-matrices

I haven't yet fixed some parts, so hopefully I'll fix this by today! Sorry on the delay!
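For reference, the linked wiki approach looks roughly like the sketch below; it assumes FastLanguageModel.get_peft_model forwards modules_to_save to PEFT's LoraConfig, and the other arguments are illustrative defaults rather than values from this thread:

model = FastLanguageModel.get_peft_model(
    model,
    r = 16,
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj"],
    # Train and checkpoint the resized input/output embeddings in full,
    # so the enlarged embed_tokens / lm_head matrices are restored on load.
    modules_to_save = ["embed_tokens", "lm_head"],
)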

danielhanchen added the "currently fixing (Am fixing now!)" label on Mar 13, 2024

charlesCXK commented Mar 17, 2024

@danielhanchen
Dear author,

Thanks for your reply! I think the core problem is not related to adding modules_to_save. We can see that the model is already saved ("copying a param with shape torch.Size([32002, 4096]) from checkpoint"). I am wondering how I can load the saved model using FastLanguageModel.from_pretrained. I want to use the saved model (with the new vocabulary) for inference.


charlesCXK commented Mar 22, 2024

Dear author,
I have fixed the issue and created a pull request: #272.
Now we can successfully load checkpoints with newly added special tokens through the new_token_num argument of the FastLanguageModel.from_pretrained function.
@danielhanchen
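A hedged usage sketch of the proposed argument (its exact semantics are defined in #272; here new_token_num is assumed to be the number of tokens added on top of the base vocabulary):

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = save_path,
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
    new_token_num = 2,   # "<NEW1>" and "<NEW2>" from the snippet above
)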

danielhanchen commented

@charlesCXK Oh thanks!! So sorry again on the issue! I'll take a look at your PR - thanks so much again!


chtmp223 commented Apr 4, 2024

Hi, bumping this up again! I added a new token to the tokenizer. Now I want to load my LoRA checkpoint using from_pretrained, but I got the same error:

RuntimeError: Error(s) in loading state_dict for PeftModelForCausalLM:
	size mismatch for base_model.model.model.embed_tokens.weight: copying a param with shape torch.Size([32001, 4096]) from checkpoint, the shape in current model is torch.Size([32000, 4096]).
	size mismatch for base_model.model.lm_head.weight: copying a param with shape torch.Size([32001, 4096]) from checkpoint, the shape in current model is torch.Size([32000, 4096]).

@danielhanchen Would you mind reviewing charlesCXK's PR?

danielhanchen commented

@charlesCXK @chtmp223 Whoops I actually totally missed this, but now using resize_model_vocab = True in FastLanguageModel.from_pretrained(...) should hopefully fix the issue
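For anyone landing here with the same mismatch, the suggested call looks roughly like this (a sketch only; depending on the installed Unsloth version, resize_model_vocab may expect the new vocabulary size, e.g. len(tokenizer), rather than a boolean):

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = save_path,
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
    resize_model_vocab = True,   # or the target vocab size, depending on version
)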
