Background: I deployed ModelScope-Agent-7B locally on an NVIDIA A100, but it is very slow: each chat call takes about 18 seconds on average.
Following the steps at https://modelscope.cn/models/iic/ModelScope-Agent-7B/summary, I installed flash-attention==2.3.5, layer_norm, and rotary-embedding-torch==0.5.3.
When I start ModelScope-Agent-7B, it still reports: Warning: import flash_attn rotary fail, please install FlashAttention rotary to get higher efficiency https://github.com/Dao-AILab/flash-attention/tree/main/csrc/rotary
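For reference, this warning is printed when the rotary fast-path import fails at model load time, so a quick way to verify the install is to run the same imports directly. This is a minimal sketch, assuming the Qwen-style modeling code behind ModelScope-Agent-7B and that `rotary_emb` is the extension module built from csrc/rotary; exact module paths can differ across flash-attn versions:

```python
# Minimal check: do the rotary fast-path imports succeed? These mirror the
# imports whose failure triggers the "import flash_attn rotary fail" warning
# (assumption: Qwen-style modeling code, flash-attn 2.3.x package layout).
try:
    import rotary_emb  # noqa: F401  # compiled extension built from csrc/rotary
    from flash_attn.layers.rotary import apply_rotary_emb_func  # noqa: F401
    print("FlashAttention rotary is available; the warning should disappear.")
except ImportError as e:
    print(f"FlashAttention rotary is NOT available: {e}")
    # One known way to build the extension from source (needs network access
    # and a CUDA toolchain matching your PyTorch build):
    #   git clone https://github.com/Dao-AILab/flash-attention
    #   cd flash-attention/csrc/rotary && pip install .
```

If the import fails even after `pip install`, the extension was likely built against a different CUDA/PyTorch version than the one in the running environment.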
Could someone take a look at this?
Hi, modelscope-agent released a new version recently; we recommend upgrading to the latest one. For local deployment issues, you can refer to the offline (no-external-network) deployment solution here: https://github.com/modelscope/modelscope-agent/pull/307/files