feat: Reduce "qps" and "max token 384" error #58

Closed
lanyeeee wants to merge 3 commits

@lanyeeee lanyeeee commented Oct 31, 2023

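Use Polly to implement a retry policy.
If a qps error occurs, retry after 1 second, up to 3 times. If the qps error still occurs after 3 retries, the exception is thrown.
Before embedding, split the data parsed by kernel-memory into at most 16 segments to reduce "max token 384" errors.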
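
The two behaviors described above can be sketched roughly as follows. This is not the PR's code: the PR uses Polly in .NET, while this is a language-neutral illustration in Python. The names `QpsLimitError`, `call_with_retry`, and `split_text` are hypothetical, and splitting by character count is an assumption, since the service's limit is expressed in tokens and the PR does not say how segments are measured.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


class QpsLimitError(Exception):
    """Hypothetical stand-in for the service's rate-limit (qps) error."""


def call_with_retry(
    action: Callable[[], T],
    retries: int = 3,
    delay_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `action`; on a qps error, wait and retry up to `retries` times."""
    attempt = 0
    while True:
        try:
            return action()
        except QpsLimitError:
            if attempt >= retries:
                raise  # still rate-limited after the last retry: propagate
            attempt += 1
            sleep(delay_seconds)


def split_text(text: str, max_parts: int = 16) -> List[str]:
    """Split `text` into at most `max_parts` contiguous, near-equal segments.

    Assumption: segments are measured in characters, not tokens.
    """
    if max_parts < 1:
        raise ValueError("max_parts must be at least 1")
    if not text:
        return []
    parts_count = min(max_parts, len(text))
    base, extra = divmod(len(text), parts_count)
    parts: List[str] = []
    start = 0
    for i in range(parts_count):
        # The first `extra` segments take one additional character.
        size = base + (1 if i < extra else 0)
        parts.append(text[start:start + size])
        start += size
    return parts
```

In this sketch a caller would split the text first and then wrap the embedding request in `call_with_retry`, so that only the rate-limited request is repeated, not the splitting.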

@xbotter xbotter self-requested a review November 1, 2023 04:27
- In GenerateEmbeddingsAsync, the data from kernel-memory will be divided into up to 16 parts before request
- Add huge pdf sample
@lanyeeee lanyeeee changed the title from 'feat: Use Polly to implement retry to avoid qps error' to 'feat: Reduce "qps" and "max token 384" error' on Nov 1, 2023
@lanyeeee lanyeeee closed this by deleting the head repository Jun 1, 2024