
Replication of Instructor #42

Open
aamir-s18 opened this issue May 28, 2023 · 15 comments

@aamir-s18

Hey, we are currently trying to replicate the INSTRUCTOR model. Issue #14 already asks about this, but could you please report the exact training setup for the released models?

I am also interested in the training loss. I did not reproduce your reported results by training for 100k steps, and it is unclear to me how only 40k steps were used when the paper says the model was trained on the MEDI dataset.

I would appreciate your help here :)

@hongjin-su
Collaborator

Hi, thanks a lot for your interest in the INSTRUCTOR model!

As the MEDI dataset contains a large volume of data, there is no need to train on all of it. In fact, since some sources in MEDI contain similar data, there may be an overfitting problem if training goes up to 100k steps.

For your reference, we use the following command in the training:

```bash
python train.py \
  --model_name_or_path sentence-transformers/gtr-t5-large \
  --output_dir {output_directory} \
  --cache_dir {cache_directory} \
  --max_source_length 512 \
  --num_train_epochs 10 \
  --save_steps 500 \
  --cl_temperature 0.01 \
  --warmup_ratio 0.1 \
  --learning_rate 2e-5 \
  --overwrite_output_dir
```

Feel free to add any further questions or comments!

@aamir-s18
Author

Hey,

But for your published model, what data exactly did you train it on?

Also, the loss and batch size are missing from your report. If you say 40k steps, for example, the number of samples seen differs a lot depending on the batch size. It would be great if you could report the exact training setup so that your work can be replicated and verified.

Thanks!

@hongjin-su
Collaborator

Hi, we train the model on the MEDI data, which you can download from https://drive.google.com/file/d/1vZ5c2oJNonGOvXzppNg5mHz24O6jcc52/view?usp=sharing. In our setting, we use a batch size of only 4.
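For anyone else replicating: a minimal sketch of inspecting the downloaded MEDI data, assuming the archive unpacks to a single JSON list of training examples (the file name medi-data.json and the list layout are assumptions about the release, not confirmed in this thread):

```python
import json

# Assumed file name after unpacking the Google Drive archive.
with open("medi-data.json") as f:
    medi = json.load(f)  # assumed: a JSON list of training examples

print(len(medi))       # total number of training examples
print(medi[0].keys())  # inspect the schema of one example
```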

@aamir-s18
Author

Hey,

Could you please report the loss as well? And does this mean you train on only 4 × 40k samples of the MEDI dataset, for a single epoch?

@hongjin-su
Collaborator

Hi,

  1. The training loss is in general between 0.4 and 0.5 for all three models.
  2. Yes. The MEDI data contains abundant sources, and some of them may be similar, so there is no need to train the model on all of the data.
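For concreteness, 40k steps at batch size 4 works out to 40,000 × 4 = 160,000 examples, i.e. a single partial pass over MEDI. Below is a minimal sketch of the temperature-scaled contrastive objective suggested by the --cl_temperature 0.01 flag above, assuming cosine similarity with in-batch negatives; this is a common InfoNCE-style formulation, not the authors' exact training code:

```python
import torch
import torch.nn.functional as F

def contrastive_loss(query_emb: torch.Tensor,
                     pos_emb: torch.Tensor,
                     temperature: float = 0.01) -> torch.Tensor:
    """InfoNCE-style loss: each query's matching positive sits on the
    diagonal of the similarity matrix; the other in-batch positives
    act as negatives."""
    q = F.normalize(query_emb, dim=-1)  # unit-norm, so q @ p.T is cosine sim
    p = F.normalize(pos_emb, dim=-1)
    logits = q @ p.T / temperature      # (batch, batch) similarity matrix
    labels = torch.arange(q.size(0), device=q.device)
    return F.cross_entropy(logits, labels)

# With batch size 4: loss over 4 queries, each with 3 in-batch negatives.
loss = contrastive_loss(torch.randn(4, 768), torch.randn(4, 768))
```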

@yangjianxin1

A batch size of 4 is very small for contrastive learning; maybe it should be larger, such as 32 or 64?

@hongjin-su
Collaborator

Yes, the model would probably be better with a larger training batch size. However, due to the limits of our machine, we leave further scaling to future work!
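A common single-machine workaround is gradient accumulation (a standard Hugging Face TrainingArguments flag; whether train.py forwards it is an assumption here), but note that it only enlarges the averaged gradient, not the pool of in-batch negatives each step sees:

```bash
# Hypothetical: effective batch 32 for gradient averaging, but each step
# still only contrasts against the 4 examples in its own forward pass.
python train.py ... --per_device_train_batch_size 4 --gradient_accumulation_steps 8
```

Truly larger negative pools require a bigger per-device batch or techniques such as gradient caching.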

@iavinasoss

iavinasoss commented Jun 28, 2023

> Hi, we train the model on the MEDI data, which you can download from https://drive.google.com/file/d/1vZ5c2oJNonGOvXzppNg5mHz24O6jcc52/view?usp=sharing. In our setting, we use a batch size of only 4.

Hey, I had a small question: where can we change the batch size? I can't find an argument for it.

Thanks

@hongjin-su
Collaborator

Hi, you may change the batch size via the argument `per_device_train_batch_size`.
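Putting this together with the command quoted earlier, the batch size of 4 used above would be set like this (per_device_train_batch_size is a standard Hugging Face TrainingArguments flag):

```bash
python train.py \
  --model_name_or_path sentence-transformers/gtr-t5-large \
  --output_dir {output_directory} \
  --cache_dir {cache_directory} \
  --max_source_length 512 \
  --num_train_epochs 10 \
  --save_steps 500 \
  --cl_temperature 0.01 \
  --warmup_ratio 0.1 \
  --learning_rate 2e-5 \
  --per_device_train_batch_size 4 \
  --overwrite_output_dir
```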

@iavinasoss

Got it, thank you for the help.

@YihanWang617

YihanWang617 commented Jul 19, 2023

Hi, I am also trying to replicate your work. May I know how many GPUs you used in training?

@hongjin-su
Collaborator

Hi, we use only a single GPU in the training.
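(So the global batch size is per_device_train_batch_size × number of GPUs = 4 × 1 = 4, matching the batch size quoted earlier in the thread.)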

@EliverQ

EliverQ commented Jul 24, 2023

> Hey, we are currently trying to replicate the INSTRUCTOR model. Issue #14 already asks about this, but could you please report the exact training setup for the released models?
>
> I am also interested in the training loss. I did not reproduce your reported results by training for 100k steps, and it is unclear to me how only 40k steps were used when the paper says the model was trained on the MEDI dataset.
>
> I would appreciate your help here :)

Hey! I also ran into issues reproducing the results. Have you successfully replicated INSTRUCTOR's performance? Even with the exact same settings, I couldn't reproduce it. If you have succeeded, could you please give me some advice? Thank you very much.

@aamir-s18
Author

@EliverQ could you reach me via email at aamir.shakir [at] epfl.ch?

@YihanWang617

YihanWang617 commented Aug 27, 2023

Hi, I have the same issue and cannot replicate the results reported in the paper. Could the authors provide the exact training commands for the released checkpoints?
