Shared-prefix rope issue #194

lkc1997 · 2024-04-01T09:07:15Z

I found that during shared-prefix calculation, this kernel does not use qo_indptr to split the batched queries, which may cause RoPE errors.
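To illustrate the concern, here is a minimal sketch in plain Python (not FlashInfer code) of how per-query positions differ when a CSR-style `qo_indptr` is honored versus when the packed batch is treated as one sequence. The helper name `positions_from_indptr` and the `offsets` argument are hypothetical and exist only for this illustration.

```python
def positions_from_indptr(qo_indptr, offsets):
    """Return one RoPE position per packed query row.

    qo_indptr: CSR-style boundaries; request i owns rows
               qo_indptr[i]:qo_indptr[i + 1].
    offsets:   hypothetical per-request starting position, e.g. the
               number of tokens that precede the request's first query.
    """
    positions = []
    for i, offset in enumerate(offsets):
        n_rows = qo_indptr[i + 1] - qo_indptr[i]
        positions.extend(offset + j for j in range(n_rows))
    return positions


# Two requests packed into 5 rows: 3 query tokens, then 2 query tokens.
qo_indptr = [0, 3, 5]
offsets = [10, 10]  # assumption: both continue after a 10-token shared prefix

# Positions restart at each request boundary.
per_request = positions_from_indptr(qo_indptr, offsets)

# Ignoring qo_indptr numbers the rows as if they were one long sequence.
flat = [offsets[0] + j for j in range(qo_indptr[-1])]
```

In this toy case the rows of the second request get different positions under the two schemes, which is the kind of mismatch that would rotate queries by the wrong angle.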

yzh119 · 2024-05-07T00:42:32Z

Hi @lkc1997, our C++ APIs take a few additional arguments (such as rope_position, which specifies the RoPE position of each query), but we have not yet ported them to the PyTorch APIs. I'll do that in the next release.

For now, you can apply RoPE outside the attention kernel and call the attention kernel without RoPE.
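A minimal sketch of that workaround in plain Python, assuming the interleaved pair layout (dimensions 2k and 2k+1 rotated together) and a base of 10000; real models may use a different layout or base. `apply_rope` is a hypothetical helper, not a FlashInfer function. Each query is rotated with an explicitly supplied position, and the rotated tensors would then be passed to an attention call with RoPE disabled.

```python
import math


def apply_rope(vec, position, theta=10000.0):
    """Rotate each pair (vec[2k], vec[2k+1]) by position * theta**(-2k/d)."""
    d = len(vec)
    out = list(vec)
    for k in range(d // 2):
        angle = position * theta ** (-2.0 * k / d)
        c, s = math.cos(angle), math.sin(angle)
        x, y = vec[2 * k], vec[2 * k + 1]
        out[2 * k] = x * c - y * s
        out[2 * k + 1] = x * s + y * c
    return out


# One position per packed query row, chosen by the caller per request.
queries = [[1.0, 0.0, 0.5, -0.5], [0.0, 1.0, 2.0, 1.0]]
positions = [10, 10]
rotated = [apply_rope(q, p) for q, p in zip(queries, positions)]
```

Keys need the same treatment with their own positions, so that the query-key dot product depends only on the relative distance between the two positions.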
