The Chinese dataset has fewer than 52k entries, and far fewer of them have an input. Why is that? #33

Open
gaijigoumeiren opened this issue Sep 9, 2023 · 0 comments

Comments

@gaijigoumeiren

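The English dataset has 52,002 entries, of which 20,679 have a non-empty input, roughly 39.8%.
The Chinese dataset has 48,818 entries, of which 6,808 have a non-empty input, roughly 13.9%.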

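  1. Why does the Chinese dataset have fewer entries?
  2. I looked through some of the Chinese entries: in some the input was merged into the prompt, and in others it was simply dropped. What was the reasoning behind this?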
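
For anyone who wants to reproduce these counts, here is a minimal sketch of how they could be computed. It assumes Alpaca-style records with `instruction`, `input`, and `output` keys; the helper name `input_stats` and the inline sample are hypothetical and not taken from this repository.

```python
import json
from typing import Iterable, Mapping


def input_stats(records: Iterable[Mapping[str, str]]) -> tuple[int, int, float]:
    """Return (total, with_input, share) for Alpaca-style records (assumed schema)."""
    total = 0
    with_input = 0
    for rec in records:
        total += 1
        # Treat a missing, None, or whitespace-only "input" as empty.
        if (rec.get("input") or "").strip():
            with_input += 1
    share = with_input / total if total else 0.0
    return total, with_input, share


# Hypothetical in-memory sample standing in for the real JSON files.
sample = json.loads("""
[
  {"instruction": "Translate to French.", "input": "Good morning", "output": "Bonjour"},
  {"instruction": "Name three colors.", "input": "", "output": "Red, green, blue"},
  {"instruction": "Give a health tip.", "input": "   ", "output": "Drink water."}
]
""")

total, with_input, share = input_stats(sample)
print(f"{total} entries, {with_input} with input ({share:.1%})")
```

To run it on the actual files, replace `sample` with the list loaded via `json.load` from each dataset file and compare the two results.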