
How does TernaryBERT actually reduce model size? #238

Open
saggitarxm opened this issue May 16, 2023 · 0 comments

Comments

@saggitarxm

Hi, I have read your paper and code. The weights such as word_embedding and the q, k, v projections are quantized with TWN. However, TWN only quantizes the weight *values* to ternary; each weight is still stored in 32 bits rather than 2 bits, so the saved model is the same size as the original, and inference time is not reduced either. Where is my understanding wrong?
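For context, here is a minimal NumPy sketch (my own illustration, not code from this repository) of how TWN-style ternary codes could be packed into 2 bits per weight plus a single fp32 scale per tensor, which is what a real 2-bit storage format would require. The `0.7 * mean(|w|)` threshold and the `pack_2bit` / `unpack_2bit` helpers are assumptions for illustration only:

```python
import numpy as np

def ternarize(w):
    """Simplified TWN-style ternarization: threshold at 0.7 * mean(|w|),
    scale alpha = mean |w| over the surviving (non-zero) positions."""
    delta = 0.7 * np.abs(w).mean()
    t = np.zeros(w.shape, dtype=np.int8)
    t[w > delta] = 1
    t[w < -delta] = -1
    alpha = np.abs(w[t != 0]).mean() if np.any(t) else 0.0
    return t, np.float32(alpha)

def pack_2bit(t):
    """Pack ternary codes {-1, 0, +1} (mapped to {0, 1, 2}) into 2 bits each,
    i.e. four weights per byte."""
    codes = (t.ravel() + 1).astype(np.uint8)
    pad = (-codes.size) % 4
    codes = np.concatenate([codes, np.zeros(pad, np.uint8)]).reshape(-1, 4)
    return codes[:, 0] | (codes[:, 1] << 2) | (codes[:, 2] << 4) | (codes[:, 3] << 6)

def unpack_2bit(packed, n):
    """Inverse of pack_2bit: recover the first n ternary values."""
    codes = np.stack([(packed >> s) & 0b11 for s in (0, 2, 4, 6)], axis=1).ravel()[:n]
    return codes.astype(np.int8) - 1

w = np.random.randn(768, 768).astype(np.float32)   # e.g. one attention weight matrix
t, alpha = ternarize(w)
packed = pack_2bit(t)
print(w.nbytes, packed.nbytes)                      # 2359296 vs 147456 bytes (~16x)
assert np.array_equal(unpack_2bit(packed, t.size), t.ravel())
w_hat = alpha * unpack_2bit(packed, t.size).reshape(w.shape).astype(np.float32)
```

With four codes per byte, a 768x768 fp32 matrix (~2.3 MB) shrinks to ~147 KB plus one scale, roughly the 16x weight compression a 2-bit format implies. If the ternarized values are instead written back as fp32, as in my reading of the released checkpoint, the file size would indeed stay the same, which is exactly what I am asking about.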
