
UDOP - Fine tuning with bad metrics #420

Open
arvisioncode opened this issue May 7, 2024 · 2 comments

arvisioncode commented May 7, 2024

I have fine-tuned a model on FUNSD following the steps in your notebook; the only change I introduced is the base model: "microsoft/udop-large-512-300k".

Train configuration:

from transformers import TrainingArguments

training_args = TrainingArguments(output_dir="test",
                                  max_steps=3000,
                                  warmup_ratio=0.1,
                                  per_device_train_batch_size=1,
                                  per_device_eval_batch_size=1,
                                  gradient_accumulation_steps=8,  # effective batch size of 8
                                  eval_accumulation_steps=8,
                                  learning_rate=5e-5,
                                  evaluation_strategy="steps",
                                  eval_steps=100,                 # evaluate every 100 steps
                                  load_best_model_at_end=True,
                                  metric_for_best_model="f1")
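
For context, metric_for_best_model="f1" means the Trainer expects a compute_metrics function that returns an "f1" entry. A minimal sketch of the seqeval-based version typically used for token classification on FUNSD follows (the label_list below is an assumption for illustration, not taken from the notebook):

import numpy as np
from seqeval.metrics import accuracy_score, f1_score, precision_score, recall_score

# hypothetical FUNSD-style label list; adjust to the dataset's actual id2label
label_list = ["O", "B-HEADER", "I-HEADER", "B-QUESTION", "I-QUESTION", "B-ANSWER", "I-ANSWER"]

def compute_metrics(eval_pred):
    predictions, labels = eval_pred
    predictions = np.argmax(predictions, axis=2)

    # keep only real tokens; special/padding tokens are labeled -100
    true_predictions = [
        [label_list[p] for (p, l) in zip(pred, lab) if l != -100]
        for pred, lab in zip(predictions, labels)
    ]
    true_labels = [
        [label_list[l] for (p, l) in zip(pred, lab) if l != -100]
        for pred, lab in zip(predictions, labels)
    ]

    return {
        "precision": precision_score(true_labels, true_predictions),
        "recall": recall_score(true_labels, true_predictions),
        "f1": f1_score(true_labels, true_predictions),
        "accuracy": accuracy_score(true_labels, true_predictions),
    }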

The final results, as reported in the generated model card, are as follows:

This model is a fine-tuned version of microsoft/udop-large-512-300k on the FUNSD dataset. It achieves the following results on the evaluation set:

Loss: 1.3328
Precision: 0.8664
Recall: 0.8775
F1: 0.8719
Accuracy: 0.8085

However, the UDOP paper specifies that the metrics of this model trained on FUNSD should be:

[screenshot of the metrics table from the UDOP paper]

How can I reach the values reported there?

Would you advise changing any of the training parameters? Should I increase the number of epochs substantially?

Thank you so much

NielsRogge (Owner) commented

Hi,

Thanks for your interest in UDOP! Note that the metrics shown in the paper are for UdopForConditionalGeneration, not for UdopEncoderModel. In other words, they fine-tune the encoder-decoder (generative) model on FUNSD, which can also be found in my demo notebook.
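
For illustration, here is a minimal sketch of running the generative model, adapted from the UDOP example in the Transformers docs (the question prompt and the use of the nielsr/funsd-layoutlmv3 dataset are assumptions for demo purposes):

from datasets import load_dataset
from transformers import AutoProcessor, UdopForConditionalGeneration

# apply_ocr=False since FUNSD already ships words and bounding boxes
processor = AutoProcessor.from_pretrained("microsoft/udop-large", apply_ocr=False)
model = UdopForConditionalGeneration.from_pretrained("microsoft/udop-large")

dataset = load_dataset("nielsr/funsd-layoutlmv3", split="train")
example = dataset[0]
image, words, boxes = example["image"], example["tokens"], example["bboxes"]

# UDOP is prompted with a task prefix and generates its answer as text,
# rather than classifying each token with an encoder-only head
prompt = "Question answering. What is the date on the form?"
encoding = processor(image, prompt, text_pair=words, boxes=boxes, return_tensors="pt")

predicted_ids = model.generate(**encoding, max_new_tokens=20)
print(processor.batch_decode(predicted_ids, skip_special_tokens=True)[0])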

arvisioncode (Author) commented

Hi @NielsRogge and thank you very much for your work!

I have also tried fine-tuning the microsoft/udop-large-512-300k model following the steps in the demo notebook, using the default configuration. However, the resulting model shows similar metrics, around 0.8 accuracy.

Can you give us any advice to improve these training runs? Shouldn't it be possible to achieve the results from the paper with this notebook?
