
UDOP - Fine tuning with bad metrics #420

Open
arvisioncode opened this issue May 7, 2024 · 2 comments

arvisioncode commented May 7, 2024

I have fine-tuned a model on FUNSD following the steps in your notebook; the only change I introduced is the base model: "microsoft/udop-large-512-300k".

Train configuration:

from transformers import TrainingArguments

training_args = TrainingArguments(output_dir="test",
                                  max_steps=3000,
                                  warmup_ratio=0.1,
                                  per_device_train_batch_size=1,
                                  per_device_eval_batch_size=1,
                                  gradient_accumulation_steps=8,  # effective batch size of 8
                                  eval_accumulation_steps=8,
                                  learning_rate=5e-5,
                                  evaluation_strategy="steps",
                                  eval_steps=100,                 # evaluate every 100 steps
                                  load_best_model_at_end=True,
                                  metric_for_best_model="f1")
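
For context, metric_for_best_model="f1" means the Trainer expects a compute_metrics function that returns an "f1" entry. A minimal sketch of the seqeval-based version typically used for token classification on FUNSD follows (the label_list below is an assumption for illustration, not taken from the notebook):

import numpy as np
from seqeval.metrics import accuracy_score, f1_score, precision_score, recall_score

# hypothetical FUNSD-style label list; adjust to the dataset's actual id2label
label_list = ["O", "B-HEADER", "I-HEADER", "B-QUESTION", "I-QUESTION", "B-ANSWER", "I-ANSWER"]

def compute_metrics(eval_pred):
    predictions, labels = eval_pred
    predictions = np.argmax(predictions, axis=2)

    # keep only real tokens; special/padding tokens are labeled -100
    true_predictions = [
        [label_list[p] for (p, l) in zip(pred, lab) if l != -100]
        for pred, lab in zip(predictions, labels)
    ]
    true_labels = [
        [label_list[l] for (p, l) in zip(pred, lab) if l != -100]
        for pred, lab in zip(predictions, labels)
    ]

    return {
        "precision": precision_score(true_labels, true_predictions),
        "recall": recall_score(true_labels, true_predictions),
        "f1": f1_score(true_labels, true_predictions),
        "accuracy": accuracy_score(true_labels, true_predictions),
    }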

The final results, as reported in the generated model card, are as follows:

This model is a fine-tuned version of microsoft/udop-large-512-300k on the FUNSD dataset. It achieves the following results on the evaluation set:

Loss: 1.3328
Precision: 0.8664
Recall: 0.8775
F1: 0.8719
Accuracy: 0.8085

However, the UDOP paper specifies that the metrics of this model trained on FUNSD should be:

[screenshot of the metrics table from the UDOP paper]

How can I reach the values reported there?

Would you advise changing any of the training parameters? Should I increase the number of epochs substantially?

Thank you so much

NielsRogge (Owner) commented

Hi,

Thanks for your interest in UDOP! Note that the metrics shown in the paper are for UdopForConditionalGeneration, not for UdopEncoderModel. In other words, they fine-tune the encoder-decoder (generative) model on FUNSD, which can also be found in my demo notebook.
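
For illustration, here is a minimal sketch of running the generative model, adapted from the UDOP example in the Transformers docs (the question prompt and the use of the nielsr/funsd-layoutlmv3 dataset are assumptions for demo purposes):

from datasets import load_dataset
from transformers import AutoProcessor, UdopForConditionalGeneration

# apply_ocr=False since FUNSD already ships words and bounding boxes
processor = AutoProcessor.from_pretrained("microsoft/udop-large", apply_ocr=False)
model = UdopForConditionalGeneration.from_pretrained("microsoft/udop-large")

dataset = load_dataset("nielsr/funsd-layoutlmv3", split="train")
example = dataset[0]
image, words, boxes = example["image"], example["tokens"], example["bboxes"]

# UDOP is prompted with a task prefix and generates its answer as text,
# rather than classifying each token with an encoder-only head
prompt = "Question answering. What is the date on the form?"
encoding = processor(image, prompt, text_pair=words, boxes=boxes, return_tensors="pt")

predicted_ids = model.generate(**encoding, max_new_tokens=20)
print(processor.batch_decode(predicted_ids, skip_special_tokens=True)[0])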

arvisioncode (Author) commented

Hi @NielsRogge and thank you very much for your work!

I have also tried fine-tuning the microsoft/udop-large-512-300k model following the steps in the demo notebook, using the default configuration. However, the resulting model shows similar metrics, around 0.8 accuracy.

Can you give us any advice to improve these training runs? Shouldn't it be possible to achieve the results from the paper with this notebook?
