
💡 [REQUEST] - How can the CPU build of qwen-cpp be wrapped as an HTTP service? #65

Open
micronetboy opened this issue Dec 14, 2023 · 4 comments

@micronetboy

How can the CPU build of qwen-cpp be wrapped as an HTTP service?


@micronetboy micronetboy added the question Further information is requested label Dec 14, 2023
@jklj077 (Collaborator) commented Dec 14, 2023

If you want an HTTP API service: qwen-cpp has Python bindings, so swapping out the model in openai_api.py may work.
If you want an HTTP web service: web_demo.py would likewise need its model-creation part replaced.

If you need a C/C++ implementation of the model, I suggest keeping an eye on llama.cpp, which now supports Qwen as well and has a richer ecosystem.
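For the HTTP API route, here is a minimal sketch of wrapping a generation call in a standard-library HTTP server. The `qwen_cpp.Pipeline` constructor arguments and `chat` call are assumptions based on the qwen-cpp Python binding's README, and the model/tokenizer file names are placeholders; the stub fallback lets the sketch run even when the binding or model files are absent.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

try:
    # Assumed API of the qwen-cpp Python binding (pip install qwen-cpp);
    # the file names below are placeholders for your converted GGML model
    # and tokenizer, not files shipped with this sketch.
    import qwen_cpp
    _pipeline = qwen_cpp.Pipeline("qwen7b-ggml.bin", "qwen.tiktoken")

    def generate(prompt: str) -> str:
        return _pipeline.chat([prompt])
except Exception:
    def generate(prompt: str) -> str:
        # Stub fallback so the server runs without the model installed.
        return "[stub reply to: %s]" % prompt

class ChatHandler(BaseHTTPRequestHandler):
    """Accepts POST {"prompt": "..."} and returns {"response": "..."}."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = json.loads(self.rfile.read(length) or b"{}")
        reply = generate(body.get("prompt", ""))
        payload = json.dumps({"response": reply}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

if __name__ == "__main__":
    # Single-threaded: requests are handled strictly one at a time.
    HTTPServer(("0.0.0.0", 8000), ChatHandler).serve_forever()
```

A client would then POST `{"prompt": "..."}` to port 8000 and read the JSON reply. Note that `HTTPServer` serves requests serially, so this sketch inherits the same no-concurrency limitation discussed in this thread.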

@sheiy commented Dec 19, 2023

@jklj077 A quick question: how can openai_api.py be made to support concurrent requests?

@jklj077 (Collaborator) commented Dec 20, 2023

@sheiy The openai_api.py in this repo cannot handle concurrent requests. If you need concurrency, I suggest FastChat + vLLM, which can also provide an OpenAI-compatible API.
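As a rough deployment sketch, FastChat's documented three-process serving pattern with a vLLM worker looks like the following; the model name is only an example, and flags may differ between FastChat versions, so check the FastChat docs before relying on them.

```shell
# Install FastChat with worker extras plus vLLM (GPU required for vLLM).
pip install "fschat[model_worker]" vllm

# 1. Controller process coordinates the workers.
python -m fastchat.serve.controller

# 2. vLLM worker loads the model (example model name) and handles
#    batched, concurrent generation.
python -m fastchat.serve.vllm_worker --model-path Qwen/Qwen-7B-Chat --trust-remote-code

# 3. OpenAI-compatible API server fronts the whole stack over HTTP.
python -m fastchat.serve.openai_api_server --host 0.0.0.0 --port 8000
```

Clients can then talk to port 8000 with any OpenAI-style SDK, and vLLM's continuous batching handles concurrent requests, which is the gap openai_api.py leaves open.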

@jklj077 jklj077 transferred this issue from QwenLM/Qwen Dec 20, 2023
@sheiy commented Dec 22, 2023

@jklj077 Thanks!
