This repository has been archived by the owner on Nov 8, 2022. It is now read-only.

question: [Q8Bert experiment Setting] #219

Open
daumkh402 opened this issue Apr 9, 2021 · 2 comments
Labels
question Further information is requested

Comments

@daumkh402

daumkh402 commented Apr 9, 2021

Hello, I read the Q8BERT paper and have tried to reproduce the experiment results.
However, on some GLUE tasks (e.g. CoLA, MRPC), the differences between the FP32 results and the quantized ones are much larger than the differences reported in the paper.
I tried sweeping the initial learning rate, but the results were still far from the reported ones.

[screenshot: reproduced GLUE results]

So I would like to ask whether the Q8BERT experiments were run with the default parameters set inside the nlp-architect code, as shown below.

[screenshot: default hyperparameters in the nlp-architect code]

If not, could you tell me the experiment settings?
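
For reference, my understanding of the paper's scheme is symmetric linear 8-bit "fake" quantization of weights and activations during fine-tuning, with a straight-through estimator in the backward pass. Below is a minimal PyTorch sketch of that idea; it is my own simplification, not the nlp-architect implementation (which, among other things, tracks activation ranges with an EMA rather than the per-batch max used here):

```python
import torch


class FakeQuantize(torch.autograd.Function):
    """Symmetric linear 'fake' quantization with a straight-through estimator.
    Simplified sketch of the scheme described in the Q8BERT paper; not the
    nlp-architect implementation."""

    @staticmethod
    def forward(ctx, x, num_bits=8):
        thresh = 2 ** (num_bits - 1) - 1                # 127 for 8 bits
        # Scale from the max absolute value; the paper uses an EMA of this
        # statistic for activations instead of the per-batch max.
        scale = thresh / x.abs().max().clamp(min=1e-8)
        return torch.round(x * scale).clamp(-thresh, thresh) / scale

    @staticmethod
    def backward(ctx, grad_output):
        # Straight-through estimator: treat quantization as identity in backward.
        return grad_output, None


class QuantizedLinear(torch.nn.Linear):
    """Linear layer that fake-quantizes its inputs and weights in forward,
    while the optimizer still updates the full-precision weights."""

    def forward(self, x):
        return torch.nn.functional.linear(
            FakeQuantize.apply(x), FakeQuantize.apply(self.weight), self.bias
        )
```

So the quantization-aware fine-tuning itself seems clear; what I am missing are the hyperparameters (learning rate, batch size, epochs, etc.) used around it.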

@daumkh402 daumkh402 added the question Further information is requested label Apr 9, 2021
@daumkh402 daumkh402 changed the title question: [question topic] question: [Q8Bert experiment Setting] Apr 9, 2021
@ofirzaf
Collaborator

ofirzaf commented Apr 19, 2021

Hi,

What version of nlp_architect and transformers did you use to run the experiments?

Please note that both the MRPC and CoLA tasks are known to be unstable in their results.

The experiments in the paper were done using a very early version of HF/transformers. Here are the official results from HF that were relevant at the time the paper was written: https://huggingface.co/transformers/v1.0.0/examples.html#glue-results-on-dev-set
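
Given that instability, a common way to compare against reported numbers (general practice, not anything specific to nlp-architect) is to repeat the fine-tuning with several random seeds and look at the mean score rather than a single run. A minimal sketch, where `run_finetuning` is a hypothetical placeholder for whatever training entry point you use:

```python
import statistics


def run_finetuning(task: str, seed: int, learning_rate: float) -> float:
    """Hypothetical placeholder: run one quantization-aware fine-tuning job and
    return the dev-set metric (e.g. F1 for MRPC, Matthews corr. for CoLA)."""
    raise NotImplementedError


seeds = (1, 12, 42, 123, 1234)
scores = [run_finetuning("mrpc", seed=s, learning_rate=2e-5) for s in seeds]
print(f"mean={statistics.mean(scores):.4f} stdev={statistics.stdev(scores):.4f}")
```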

@daumkh402
Author

Hi,
The version of nlp_architect is 0.5.5.
The version of transformers is 2.4.1.

Thank you.
