
import flash_attn rotary fail #288

Open
cnahmgx opened this issue Jan 30, 2024 · 2 comments
cnahmgx commented Jan 30, 2024

Background: I deployed ModelScope-Agent-7B locally on an NVIDIA A100, and it is very slow: one chat round takes about 18 seconds on average.
Following the steps at https://modelscope.cn/models/iic/ModelScope-Agent-7B/summary, I already installed flash-attention==2.3.5, layer_norm, and rotary-embedding-torch==0.5.3.
When starting ModelScope-Agent-7B it still reports: Warning: import flash_attn rotary fail, please install FlashAttention rotary to get higher efficiency https://github.com/Dao-AILab/flash-attention/tree/main/csrc/rotary
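Note that `rotary-embedding-torch` is an unrelated pure-PyTorch package, so installing it does not satisfy this import. The warning points at the compiled rotary extension under `csrc/rotary` in the flash-attention source tree, which the plain `flash-attn` wheel may not include and which has to be built separately. A minimal sketch of that build, assuming the checked-out source tag matches the installed 2.3.5 wheel and that `apply_rotary_emb_func` is the symbol the Qwen-style modeling code guards with try/except:

```bash
# Sketch: build the rotary CUDA extension from the flash-attention source tree.
# The v2.3.5 tag is an assumption -- match it to the flash-attn wheel you installed.
git clone https://github.com/Dao-AILab/flash-attention
cd flash-attention
git checkout v2.3.5
cd csrc/rotary
pip install .

# Verify the import that triggers the warning; if this prints "ok",
# the warning should disappear on the next model startup.
python -c "from flash_attn.layers.rotary import apply_rotary_emb_func; print('ok')"
```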

cnahmgx commented Feb 20, 2024

Could someone please take a look at this?

mushenL commented Feb 23, 2024

Hello, modelscope-agent was recently updated to a new version; we recommend using the latest release. For the local-deployment question, you can refer to the plan below for deploying in an environment without internet access:
https://github.com/modelscope/modelscope-agent/pull/307/files
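A minimal sketch of moving to the latest release, assuming the package name as published on PyPI:

```bash
# Upgrade to the latest published release, then confirm the installed version
# before retrying the local deployment.
pip install -U modelscope-agent
pip show modelscope-agent
```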

@zzhangpurdue added the llm (issues about llm usage) and legacy (issues caused by legacy codes) labels Mar 12, 2024