
Request to keep smartcontext option #837

Open

Nabokov86 opened this issue May 10, 2024 · 9 comments

@Nabokov86
Regarding this change: 'Deprecated some old flags'.

Might it be possible to preserve the smartcontext option instead of removing it? I find it particularly useful for my workflow.

@henk717 commented May 11, 2024

What does smartcontext allow you to do that context shifting doesn't?

@LostRuins (Owner)

Adding on to what henk said: for GGUF models, context shift is a strict upgrade; smartcontext is only useful for old models that don't support it.

And context shift can be disabled with --noshift.
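
For anyone following along, the relevant launch flags look roughly like this (a sketch of a typical koboldcpp launch; the model path is a placeholder):

    # context shifting is on by default for GGUF models; turn it off with:
    python koboldcpp.py --model mymodel.gguf --noshift

    # the flag this issue asks to keep:
    python koboldcpp.py --model mymodel.gguf --smartcontext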

@Nabokov86 (Author) commented May 11, 2024

Smart context is significantly faster in certain scenarios.

  • It's extremely useful when loading a large text: instead of processing the entire context, it only processes a portion of it.
  • Smart context also allows safe editing of previously generated text without worrying about the entire context being reprocessed.

For example, I use an 8K model with my chat assistant and store my chat history in a single JSON file. With context shifting, it would process the entire 8K context every time I start a conversation, which results in painfully slow generation. In contrast, smart context only processes a portion of the context, making it faster both during processing and generation.
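
To make the mechanism being described concrete, here is a minimal sketch of smartcontext-style truncation (an illustration only, not koboldcpp's actual code; the function name and the truncation_ratio parameter, standing in for SCTruncationRatio, are hypothetical):

    def smartcontext_prompt(tokens, max_ctx, truncation_ratio=0.5):
        # If the prompt still fits in the window, process it as-is.
        if len(tokens) <= max_ctx:
            return tokens
        # Otherwise keep only the most recent fraction of the window;
        # only this slice has to be reprocessed. The 0.5 default mirrors
        # smartcontext's usual "keep half" behaviour.
        keep = int(max_ctx * truncation_ratio)
        return tokens[-keep:]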

@Nabokov86 (Author)

I also adjusted the default SCTruncationRatio value so that smartcontext only reprocesses 20% of the context. This suits my needs perfectly.

While I need the full 8K context for generation, I don't want the entire 8K processed at once, and context shifting can't give me that.

In my opinion, smart context has some benefits in certain use cases.
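
As a worked example of that 20% setting, using the sketch above:

    # 12K tokens of history, an 8K window, ratio 0.2:
    prompt = smartcontext_prompt(list(range(12000)), max_ctx=8192,
                                 truncation_ratio=0.2)
    print(len(prompt))  # 1638 -- generation resumes from ~1.6K, not 8K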

@LostRuins (Owner)

But the question is: how is it preferable to context shift, which is just as good but even faster? That option allows for zero reprocessing without losing any context at all.
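
For contrast, a context-shift-style rolling window behaves roughly like this (again a sketch of the idea, not the real KV-cache implementation):

    from collections import deque

    def context_shift_step(window, new_token, max_ctx):
        # When the window is full, evict the oldest token and append the
        # new one; only the new token needs processing, and the window
        # stays at its full size.
        if len(window) >= max_ctx:
            window.popleft()
        window.append(new_token)

    window = deque(range(8192))
    context_shift_step(window, 8192, max_ctx=8192)
    print(len(window))  # still 8192: full context kept, zero reprocessing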

@Nabokov86 (Author)

@LostRuins Context shifting isn't faster for my use case. With context shifting, it processes the entire 8K context and then continues generating at 8K, which is much slower. In contrast, SmartContext only processes a certain amount of text (the last 20% in my scenario) and continues generating at around 1.5K.

As a result, smart context is much faster for me, both during processing and generation, as I don't need to process the entire 8K in the first place.

Additionally, if I remove or modify a chunk of text with context shift, it will cause the entire 8K context to be reprocessed, which is frustrating.
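
The editing penalty described above comes from prefix reuse: a KV cache is only valid up to the first changed token, so an early edit forces everything after it to be recomputed. A toy illustration (hypothetical helper, not koboldcpp code):

    def reusable_prefix(old_tokens, new_tokens):
        # Length of the shared prefix between the cached history and the
        # edited one; only tokens past this point must be reprocessed.
        n = 0
        for a, b in zip(old_tokens, new_tokens):
            if a != b:
                break
            n += 1
        return n

    old = list(range(8192))
    new = old.copy()
    new[100] = -1                            # edit one token early on
    print(8192 - reusable_prefix(old, new))  # 8092 tokens to reprocess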

By the way, could you explain why you decided to remove it? It's still available for some models, right? If you believe smartcontext is inferior, I understand hiding the flag from the help page or advising against its use. However, keeping the functionality available for all models seems reasonable; it would be beneficial to have a choice between the two.

@henk717 commented May 11, 2024

Hiding it is basically what he did; the flag should still work, at least for the moment. The issue with smartcontext is that it cuts your effective context in half: set it to 8K, and once your context limit is reached it really just becomes 4K. That's something most users don't want. You could experiment with just putting context shift at 4K, because it should give the same effect.
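
In numbers, assuming the usual halving behaviour: with smartcontext and an 8192-token window, the history is trimmed back to roughly 8192 / 2 = 4096 tokens whenever the window fills, so the steady-state usable context is about 4K, which is why plain context shift at 4096 should behave comparably.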

@LostRuins (Owner)

fuck it, fine, i'll revert the smartcontext flag and add it back

@aleksusklim

Users want two things:

  1. Fast loading of old history, for which a cache should be implemented somehow (see: [Feature request] Ability to cache context between runs for faster initial generation of the same history (after app restart) #445);
  2. Reliable editing of old turns without the full reprocessing that ContextShift occasionally triggers.

For the second point, here is what you can do:
