
mT0-xxl finetuning #19

Open
sh0tcall3r opened this issue May 15, 2023 · 6 comments

Comments

@sh0tcall3r

Hello!
Thanks a lot for your work! I'm using mT0-xxl for a question-answering task, but it doesn't perform as well as I expected, so I'm trying to fine-tune the model a little. If I understood correctly, I first need the checkpoint and the gin file for the model I want to fine-tune. Could you please share these?
Also, is it possible to fine-tune it with torch, or is tf the only way?

@Muennighoff
Collaborator

Hey, there are some more details on mT0 fine-tuning here: #12
The config is here: #6 (comment)

@sh0tcall3r
Author

Thanks for the reply! I'll try the mentioned config.

@sh0tcall3r
Author

sh0tcall3r commented May 16, 2023

Hey @Muennighoff, it seems I still can't work out a couple of things. I'd really appreciate it if you could give me a hand here.
I need to fine-tune your model mT0-xxl (not the original T5X xxl), so according to the manual https://github.com/google-research/t5x/blob/main/docs/usage/finetune.md I need three components (excluding the SeqIO Task, which is clear for now) to proceed:

  1. Checkpoint -- Could you please share the mT0-xxl checkpoint? In the manual all the checkpoints used are TensorFlow weights, but on Hugging Face there are only PyTorch weights. So I either need the mT0-xxl checkpoint in TensorFlow format, or I need to fine-tune the model in PyTorch (is that even possible?).
  2. Gin file for the model to fine-tune (mT0-xxl in this case) -- Can I use the default one, i.e. https://github.com/google-research/t5x/blob/main/t5x/examples/t5/mt5/xxl.gin?
  3. Gin file configuring the fine-tuning process -- I write this myself based on https://github.com/google-research/t5x/blob/main/t5x/configs/runs/finetune.gin with some overrides, right?

Please correct me if I'm wrong on any of these points.
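For context, point 3 is roughly what I have in mind: a fine-tuning gin file layered on top of the model gin and `runs/finetune.gin`, following the macros the t5x fine-tuning manual describes. The task name, lengths, step count, and checkpoint path below are placeholders, not values from this thread:

```
# Sketch only: based on t5x/configs/runs/finetune.gin.
from __gin__ import dynamic_registration

include 't5x/examples/t5/mt5/xxl.gin'        # model gin (point 2)
include 't5x/configs/runs/finetune.gin'      # run gin (point 3)

MIXTURE_OR_TASK_NAME = "my_qa_task"          # hypothetical SeqIO task
TASK_FEATURE_LENGTHS = {"inputs": 512, "targets": 64}
TRAIN_STEPS = 1_010_000                      # pretrained steps + fine-tune steps
INITIAL_CHECKPOINT_PATH = "/path/to/mt0_xxl_t5x/checkpoint"  # placeholder
```

Note that per the manual, `TRAIN_STEPS` is the total step count including the steps already in the initial checkpoint, not just the additional fine-tuning steps.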

@Muennighoff
Collaborator

There's a t5x ckpt here: https://huggingface.co/bigscience/mt0-t5x
I don't remember which size that model is, though; I don't have the other ones anymore. Maybe @adarob does.

For 2. & 3., yes I think so
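In case it helps anyone following the t5x fine-tuning manual, here's a minimal sketch of assembling the `python -m t5x.train` launch command with the gin flags and overrides the manual describes. All paths and the task name are placeholders, not values from this thread:

```python
# Sketch: builds the t5x fine-tuning invocation described in
# docs/usage/finetune.md. Paths and the task name are placeholders.

def build_finetune_cmd(model_gin, run_gin, model_dir, ckpt, task, steps):
    """Assemble the t5x.train command line with gin overrides."""
    return [
        "python", "-m", "t5x.train",
        f"--gin_file={model_gin}",
        f"--gin_file={run_gin}",
        f"--gin.MODEL_DIR='{model_dir}'",
        f"--gin.INITIAL_CHECKPOINT_PATH='{ckpt}'",
        f"--gin.MIXTURE_OR_TASK_NAME='{task}'",
        f"--gin.TRAIN_STEPS={steps}",
    ]

cmd = build_finetune_cmd(
    model_gin="t5x/examples/t5/mt5/xxl.gin",
    run_gin="t5x/configs/runs/finetune.gin",
    model_dir="/tmp/mt0_xxl_finetune",       # placeholder
    ckpt="/path/to/mt0_xxl_t5x/checkpoint",  # placeholder
    task="my_qa_task",                       # hypothetical SeqIO task
    steps=1_010_000,                         # pretrained + fine-tune steps
)
print(" ".join(cmd))
```

The later `--gin.X=` flags override any matching bindings from the included gin files, so task name, checkpoint path, and step count can be set from the command line instead of a custom gin file if preferred.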

@adarob

adarob commented May 18, 2023 via email

@sh0tcall3r
Author

Thanks a lot, guys!
