What configuration is appropriate for this? Running the 7B model on a single A100 (10 GB of the 80 GB VRAM in use), the "how to get to Beijing" example case takes 60 seconds to return a result #310

Open
MetaRunning opened this issue Apr 3, 2024 · 2 comments

Comments

@MetaRunning

As titled.

@Rayrtfr
Collaborator

Rayrtfr commented Apr 22, 2024

> As titled.

Try an inference acceleration method; a simple option is vLLM: https://github.com/LlamaFamily/Llama-Chinese/tree/main/inference-speed/GPU/vllm_example
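For reference, a minimal sketch of the vLLM approach along the lines of the linked example (assumptions: vLLM is installed via `pip install vllm`, and the model name below is a placeholder for whichever 7B checkpoint you are actually serving):

```python
from vllm import LLM, SamplingParams

# Load the 7B model; vLLM pre-allocates the KV cache (PagedAttention),
# which is why it claims more VRAM up front than a plain transformers load.
llm = LLM(model="FlagAlpha/Atom-7B-Chat")  # placeholder -- use your checkpoint path

params = SamplingParams(temperature=0.3, top_p=0.95, max_tokens=256)

# Even for a single prompt, vLLM's optimized decode loop is typically much
# faster than a naive generate() call, which should cut the ~60 s latency.
outputs = llm.generate(["How do I get to Beijing?"], params)
print(outputs[0].outputs[0].text)
```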

@MetaRunning
Author

> Try an inference acceleration method; a simple option is vLLM: https://github.com/LlamaFamily/Llama-Chinese/tree/main/inference-speed/GPU/vllm_example

OK, thanks. I'll give it a try and see how it performs.
