How can the CPU-based qwen-cpp be wrapped as an HTTP service?
If what you need is an HTTP API service: qwen-cpp has a Python binding, so swapping out the model in openai_api.py may be enough. If what you need is an HTTP web service: web_demo.py should likewise work once the model-creation part is replaced.
If you need a C/C++ implementation of the model, I'd suggest keeping an eye on llama.cpp, which now supports Qwen as well and has a richer ecosystem.
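To make the first suggestion concrete, here is a minimal, standard-library-only sketch of wrapping a local pipeline in an HTTP endpoint. The qwen_cpp binding's exact API is not shown in this thread, so the model call is left as a stub (`generate_reply`) to be replaced with the real `qwen_cpp` pipeline call; everything around it is plain `http.server` plumbing:

```python
# Minimal sketch: wrap a local text-generation pipeline in an HTTP service.
# Assumption: qwen_cpp's Python binding provides a pipeline object whose
# chat/generate call can replace the generate_reply() stub below.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def generate_reply(prompt: str) -> str:
    # Stub standing in for the real qwen_cpp pipeline call.
    return f"(model reply to: {prompt})"


class ChatHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read and parse the JSON request body: {"prompt": "..."}
        length = int(self.headers.get("Content-Length", 0))
        body = json.loads(self.rfile.read(length) or b"{}")
        reply = generate_reply(body.get("prompt", ""))

        # Send the reply back as JSON: {"response": "..."}
        payload = json.dumps({"response": reply}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def main(host="0.0.0.0", port=8000):
    # Single-threaded server: requests are handled one at a time,
    # which matches how openai_api.py behaves (see the comments below).
    HTTPServer((host, port), ChatHandler).serve_forever()
```

Call `main()` to start serving; a client can then `POST {"prompt": "..."}` to the root path. Note this handles one request at a time, which is fine for a single CPU-bound model but is not a concurrent server.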
@jklj077 A quick question: how can openai_api.py be made to support concurrent requests?
@sheiy The openai_api.py in this repo cannot handle concurrent requests. If you need concurrency, consider FastChat + vLLM, which also exposes an OpenAI-style API.
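For reference, a typical FastChat + vLLM deployment runs three processes; the model path below (`Qwen/Qwen-7B-Chat`) is only an example, and the exact flags should be checked against the FastChat documentation for your installed version:

```shell
# 1. Start the controller that coordinates the workers.
python3 -m fastchat.serve.controller

# 2. Start a vLLM-backed model worker (vLLM's continuous batching
#    is what provides concurrent request handling).
python3 -m fastchat.serve.vllm_worker --model-path Qwen/Qwen-7B-Chat --trust-remote-code

# 3. Expose an OpenAI-compatible HTTP API.
python3 -m fastchat.serve.openai_api_server --host 0.0.0.0 --port 8000
```

Clients can then talk to port 8000 with any OpenAI-compatible SDK, pointing the base URL at this server.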
@jklj077 Thanks!