We have some multimodal models, such as nougat, that consist of a vision encoder model and an LLM decoder model. The encoder is a conventional CV model, similar to ViT, that extracts image features as `encoder_hidden_feature`, which are then passed into the decoder. The decoder receives the initial `input_ids` together with `encoder_hidden_feature`, and it contains cross-attention layers. The encoder part can be ignored; the question is about the decoder. Compared with a traditional LLM decoder, this decoder additionally takes `encoder_hidden_feature` as input and performs cross attention over it. Does this kind of decoder support persistent batch? Static batching already works in trt-llm, but to improve performance I'd like to know whether lmdeploy supports persistent batch for this kind of decoder.
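To make the decoder structure being asked about concrete, here is a minimal NumPy sketch of one decode step with both attentions. This is toy math under stated assumptions: projections and multi-head logic are omitted, the token's hidden state stands in for its own key/value, and names like `decoder_step` are illustrative, not nougat or lmdeploy code.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    # scaled dot-product attention: (T_q, d) x (T_k, d) -> (T_q, d)
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores) @ v

def decoder_step(x, self_kv, enc_kv):
    """One decode step of an enc-dec decoder layer (toy; projections omitted).

    x       : (1, d) hidden state of the current token
    self_kv : growing list of (k, v) pairs from previously generated tokens
    enc_kv  : fixed (K, V) derived once from encoder_hidden_feature
    """
    # 1) causal self-attention over all tokens generated so far
    self_kv.append((x, x))  # toy: use x as its own key/value
    k = np.vstack([kv[0] for kv in self_kv])
    v = np.vstack([kv[1] for kv in self_kv])
    h = attention(x, k, v)
    # 2) cross-attention: queries come from the decoder,
    #    K/V come from the (fixed) encoder output
    enc_k, enc_v = enc_kv
    h = h + attention(h, enc_k, enc_v)
    return h, self_kv
```

The point for batching: the self-attention cache grows by one entry per generated token, while the cross-attention K/V stay fixed per sequence, so a persistent-batch scheduler would have to manage both.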
lmdeploy does not support enc-dec models, and there are no plans for this in the short or medium term.
I took a quick look at nougat; not sure I've understood it correctly. It looks like the initial stretch of KV comes from the encoder model, so the encoder's output KV would need to be written into the KV cache, and then the decoder generates from there?
@lvhan028 @lzhangzz Thanks for the reply. In nougat, the encoder's output features are passed into the decoder together with the initial `input_ids`. Inside the decoder it works like this: there are two KV caches and two attention modules.
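The two caches behave differently, which is what makes persistent batching harder here than for a plain LLM decoder. A rough per-sequence sketch (class and method names are made up for illustration, not lmdeploy or trt-llm API):

```python
import numpy as np

class EncDecKVCache:
    """Toy per-sequence cache mirroring the two attentions in a
    nougat-style decoder layer."""

    def __init__(self, encoder_hidden_feature):
        # Cross-attention K/V: computed ONCE from the encoder output
        # (real models apply K/V projections, omitted here), then
        # read-only for the rest of the sequence.
        self.cross_k = encoder_hidden_feature
        self.cross_v = encoder_hidden_feature
        # Self-attention K/V: empty at the start, grows by one
        # entry per generated token.
        self.self_k, self.self_v = [], []

    def append_self(self, k, v):
        self.self_k.append(k)
        self.self_v.append(v)
```

So a persistent-batch engine would need to allocate a fixed-length cross-attention block per sequence at admission time, alongside the usual growing self-attention blocks, and route each of the two attention kernels to its own cache.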
Roughly understood now. Having looked into it, this probably can't be supported in the short term 😅