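As I understand it, max_seq_length is the maximum number of tokens the model can process. What is the approximate ratio between tokens in this model and Chinese characters? A blog post I read says that in text2vec, 1 token is roughly 1.5 Chinese characters. Is that correct?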
The model uses BERT's tokenization, so 1 token is 1 Chinese character.
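To make the 1:1 rule concrete, here is a rough sketch of how a BERT-style tokenizer counts tokens for Chinese text. It is not the real tokenizer: `is_cjk` and `rough_token_count` are hypothetical helper names, and the sketch assumes each CJK character and each punctuation mark becomes its own token, each run of Latin letters or digits counts as one token, and `[CLS]` and `[SEP]` add two more. Real WordPiece may split a Latin word into several pieces, so for mixed text treat the result as a lower bound.

```python
def is_cjk(ch: str) -> bool:
    """Return True if ch is in one of the main CJK ideograph blocks."""
    cp = ord(ch)
    return (
        0x4E00 <= cp <= 0x9FFF        # CJK Unified Ideographs
        or 0x3400 <= cp <= 0x4DBF     # Extension A
        or 0x20000 <= cp <= 0x2A6DF   # Extension B
        or 0xF900 <= cp <= 0xFAFF     # Compatibility Ideographs
    )


def rough_token_count(text: str, special_tokens: int = 2) -> int:
    """Estimate the BERT-style token count (hypothetical helper, not a library API).

    Assumptions: one token per CJK character, one per punctuation mark,
    one per run of letters/digits, plus `special_tokens` for [CLS]/[SEP].
    """
    count = 0
    in_word = False  # True while inside a run of non-CJK letters/digits
    for ch in text:
        if is_cjk(ch):
            count += 1
            in_word = False
        elif ch.isspace():
            in_word = False
        elif ch.isalnum():
            if not in_word:
                count += 1  # a new Latin/digit run starts here
                in_word = True
        else:
            count += 1  # punctuation is split off as its own token
            in_word = False
    return count + special_tokens


if __name__ == "__main__":
    # Four Chinese characters plus [CLS] and [SEP]
    print(rough_token_count("你好世界"))
```

Under these assumptions, a text of N Chinese characters uses about N + 2 tokens of the max_seq_length budget. For an exact count, run the model's own tokenizer on the text.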