[Feature] Is persistent batch supported for the decoder in enc-dec models? #1581

Open
Oldpan opened this issue May 10, 2024 · 4 comments

Comments

@Oldpan

Oldpan commented May 10, 2024

Motivation

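We have some multimodal models, such as nougat, that consist of a vision encoder and an LLM decoder.
The encoder is a conventional CV model, similar to ViT, that extracts image features as encoder_hidden_feature. These features are passed to the decoder together with the initial input_id, and the decoder contains cross-attention layers.
The encoder can be ignored here; my question is about the decoder. Does it support persistent batch? Compared with a conventional LLM decoder, this decoder takes an additional encoder_hidden_feature input, which is used for cross attention inside the decoder.
Static batching already works for this in trt-llm, but to improve performance I would like to ask whether lmdeploy supports persistent batch for this kind of decoder.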
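
To make the request concrete, here is a minimal sketch of what persistent (continuous) batching would look like when every request also carries its own encoder output. It is not lmdeploy or trt-llm code: `Request`, `PersistentBatch`, and the placeholder token generation are hypothetical names and behavior, assumed only for illustration.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, List, Optional


@dataclass
class Request:
    """One generation request (hypothetical structure)."""
    request_id: str
    # Encoder output for this request; fixed for the request's whole lifetime.
    encoder_features: List[List[float]]
    max_new_tokens: int
    generated: List[int] = field(default_factory=list)

    def finished(self) -> bool:
        return len(self.generated) >= self.max_new_tokens


class PersistentBatch:
    """Fixed set of batch slots; requests join and leave between decode steps."""

    def __init__(self, num_slots: int) -> None:
        self.slots: List[Optional[Request]] = [None] * num_slots
        self.waiting: Deque[Request] = deque()

    def submit(self, request: Request) -> None:
        self.waiting.append(request)

    def step(self) -> List[str]:
        """Run one decode iteration and return the IDs of finished requests."""
        # Admit waiting requests into free slots before the iteration starts.
        for i, slot in enumerate(self.slots):
            if slot is None and self.waiting:
                self.slots[i] = self.waiting.popleft()

        finished: List[str] = []
        for i, request in enumerate(self.slots):
            if request is None:
                continue
            # Placeholder for the real decoder forward pass, which would attend
            # over request.encoder_features via cross attention.
            request.generated.append(len(request.generated))
            if request.finished():
                finished.append(request.request_id)
                self.slots[i] = None  # slot is free again for the next step
        return finished
```

With static batching, a short request would hold its slot until the longest request in the batch finishes. Here the slot is released as soon as the request completes, which is the performance gain being asked about.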

Related resources

No response

Additional context

No response

@lvhan028
Collaborator

lmdeploy does not support enc-dec models. There are also no plans for this in the short to medium term.

@lzhangzz
Collaborator

lzhangzz commented May 11, 2024

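I took a quick look at nougat and am not sure I understood it correctly. It looks like the KV for the initial segment comes from the encoder model, so the KV produced by the encoder has to be filled into the KV cache before the decoder is used for generation?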

@Oldpan
Author

Oldpan commented May 16, 2024

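@lvhan028 @lzhangzz Thanks for the replies. In nougat, the feature output by the encoder is passed into the decoder together with the initial input_ids. Inside the decoder it works like this:
image
There are two KV caches and two attention operations here.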
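
Since the screenshot is not reproduced here, the following is a minimal sketch of one reading of "two KV caches and two attention operations": it assumes the first cache belongs to self attention and grows by one entry per generated token, while the second belongs to cross attention and is filled once from the encoder features. All names (`attend`, `DecoderLayerState`, `decode_step`) are hypothetical, and identity projections stand in for the learned weight matrices of a real layer.

```python
import math
from dataclasses import dataclass, field
from typing import List

Vector = List[float]


def attend(query: Vector, keys: List[Vector], values: List[Vector]) -> Vector:
    """Single-head scaled dot-product attention for one query vector."""
    scale = 1.0 / math.sqrt(len(query))
    scores = [scale * sum(q * k for q, k in zip(query, key)) for key in keys]
    peak = max(scores)  # subtract the max for a numerically stable softmax
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) / total
            for i in range(dim)]


@dataclass
class DecoderLayerState:
    # Cross-attention K/V: derived once from the encoder output, fixed length.
    cross_keys: List[Vector]
    cross_values: List[Vector]
    # Self-attention K/V: one entry appended per decoded token.
    self_keys: List[Vector] = field(default_factory=list)
    self_values: List[Vector] = field(default_factory=list)


def decode_step(state: DecoderLayerState, hidden: Vector) -> Vector:
    """One decode step of a simplified layer: self attention, then cross attention."""
    # Assumption: identity projections instead of learned W_q, W_k, W_v.
    state.self_keys.append(hidden)
    state.self_values.append(hidden)
    self_out = attend(hidden, state.self_keys, state.self_values)
    x = [h + a for h, a in zip(hidden, self_out)]  # residual connection

    # Cross attention reads the encoder-derived cache and never appends to it.
    cross_out = attend(x, state.cross_keys, state.cross_values)
    return [a + b for a, b in zip(x, cross_out)]
```

Under this reading, a persistent batch would have to manage two caches with different lifetimes for each request: one that grows at every step and one whose size is set by the encoder output and stays constant until the request ends.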

@lzhangzz
Collaborator

Got the general idea. Having looked at it, we probably can't support this in the short term 😅
