Many problems encountered while reproducing the source code #187
Labels: bug (Something isn't working)

Comments
This branch has not been tested with the server; you can try running the tests to see whether they pass.

So does that mean only the Static inference performance section of the README can currently be run (on the kvoff branch)?

Yes. Because serving performance was limited, we did not implement it further.
I downloaded the Chinese-LLaMA-2-1.3B model from the Hugging Face website and then ran it.

This problem is similar to this issue, but that issue does not contain a detailed solution.
How I ran LightLLM (reproducing the kvoff branch)

Step 1: Create the Docker container

Pull the image:

docker pull ghcr.io/modeltc/lightllm:main

The llama-7b model is too large: cloning it directly inside the server's Docker container kept failing with network interruptions. So I downloaded the model locally, transferred it to the server with Xftp, and then, when creating the container, mapped the model folder into the models folder of the lightllm source tree.
模型仓库:[huggyllama/llama-7b · Hugging Face](https://huggingface.co/huggyllama/llama-7b)
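The container-creation step above can be sketched roughly as follows. This is only an illustration, not the exact command from the report: the host path `/data/llama-7b`, the container-side mount point, and the port are placeholders to adjust for your own setup.

```shell
# Hypothetical paths: replace /data/llama-7b with where Xftp put the model,
# and the mount target with the models folder of your lightllm checkout.
docker run -it --gpus all \
  --shm-size 4g \
  -p 8080:8080 \
  -v /data/llama-7b:/lightllm/models/llama-7b \
  ghcr.io/modeltc/lightllm:main /bin/bash
```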
Step 2: Run

Install from source:

Run the model:

Error message: OOM
I then changed the value of max_total_token_num from 120000 to 6000, and the OOM error disappeared, but the errors below occurred instead (each run randomly produces one of the following three errors). I searched Google for similar errors but did not find a solution.
1
2
3
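The OOM at max_total_token_num=120000 is plausible on a single GPU: in LightLLM this value is the total number of KV-cache token slots preallocated on the GPU, so a back-of-the-envelope estimate of per-token KV memory shows why 120000 cannot fit. The llama-7b shape used here (32 layers, hidden size 4096, fp16) is an assumption based on the standard LLaMA-7B configuration.

```python
# Back-of-the-envelope KV-cache sizing, assuming the standard llama-7b shape:
# 32 layers, hidden size 4096, fp16 activations (2 bytes per element).
def kv_cache_bytes_per_token(num_layers=32, hidden_size=4096, dtype_bytes=2):
    # Each token stores one K vector and one V vector of size hidden_size
    # in every layer, hence the factor of 2.
    return 2 * num_layers * hidden_size * dtype_bytes

per_token = kv_cache_bytes_per_token()        # 524288 bytes = 0.5 MiB per token
gib = 1024 ** 3
print(round(120000 * per_token / gib, 1))     # ~58.6 GiB -> far beyond one GPU, OOM
print(round(6000 * per_token / gib, 1))       # ~2.9 GiB -> fits comfortably
```

This matches the observed behavior: dropping the value to 6000 shrinks the preallocated KV cache by a factor of 20 and the OOM disappears.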
api_server fails to run:
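For reference, the server launch described in the README looks roughly like the sketch below. The model path is a placeholder, and the exact flag names should be checked against the README of the branch you are on; note the thread above says the kvoff branch was never tested with the server, so only the static inference script is expected to work there.

```shell
# Sketch of the api_server launch (flag names per the main-branch README;
# /lightllm/models/llama-7b is a placeholder path).
python -m lightllm.server.api_server \
  --model_dir /lightllm/models/llama-7b \
  --host 0.0.0.0 \
  --port 8080 \
  --tp 1 \
  --max_total_token_num 6000
```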