Memory cache never compacts expired keys with a stable set of cache keys #1993

Open · dudleycarr opened this issue Jul 6, 2023 · 2 comments · May be fixed by #2568
Labels: annoying (Benthos is mildly annoying but not quite a bug), caches (Any tasks or issues relating to cache resources)

Comments

dudleycarr (Contributor) commented Jul 6, 2023

We're using Benthos to periodically monitor and process a set of IDs. To prevent needless reprocessing, we cache each ID for a period of time. If an ID wasn't successfully processed downstream, then once the cache key's TTL has elapsed we expect the ID to have been purged from the cache and to be reprocessed.

Minimal configuration:

input:
  generate:
    interval: 1s
    mapping: root.key = "a"
pipeline:
  threads: -1
  processors:
    - cache:
        resource: mycache
        operator: add
        key: '${! json("key") }'
        value: "foo"
        ttl: 10s
    - mapping: root = if errored() { deleted() }
output:
  label: ""
  stdout:
    codec: lines
cache_resources:
  - label: mycache
    memory:
      compaction_interval: 1s

Expected output after 10s

{"key":"a"}
{"key":"a"}

Actual output after 10s

{"key":"a"}

The current workaround is to unconditionally set a key on each pass through the pipeline to force compaction of expired cache entries (see the sketch below).
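As a rough sketch of that workaround against the config above (the compaction_trigger key, its value, and its TTL are arbitrary illustrative choices), adding an unconditional set means every pass performs a cache write, which is what triggers the memory cache's compaction check:

pipeline:
  threads: -1
  processors:
    # Unconditional write on every message; the key, value and TTL are
    # arbitrary, the write itself is what triggers compaction of expired
    # entries.
    - cache:
        resource: mycache
        operator: set
        key: compaction_trigger
        value: "trigger"
        ttl: 1s
    - cache:
        resource: mycache
        operator: add
        key: '${! json("key") }'
        value: "foo"
        ttl: 10s
    - mapping: root = if errored() { deleted() }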

Jeffail (Member) commented Jul 8, 2023

Hey @dudleycarr, yeah that's annoying, ideally we should have a background process that does occasional compactions.

We currently document "The compaction interval determines how often the cache is cleared of expired items, and this process is only triggered on writes to the cache", so changing that behaviour would technically be a breaking change, but we could maybe add a field enable_background_compactions: true to opt in (sketched below).
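Purely for illustration, the opt-in might look like this in the cache resource config; enable_background_compactions is only the field name proposed above and does not currently exist in the memory cache:

cache_resources:
  - label: mycache
    memory:
      compaction_interval: 1s
      # Proposed field, not an existing option: when true, a background
      # loop would compact expired entries every compaction_interval even
      # when no writes occur.
      enable_background_compactions: true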

Jeffail added the annoying and caches labels on Jul 8, 2023
kmpm commented May 7, 2024

How about checking isExpired in the Add func? Basically, in internal/impl/pure/cache_memory.go, this means changing

	if _, exists := shard.items[key]; exists {
		shard.Unlock()
		return service.ErrKeyAlreadyExists
	}

to

	if k, exists := shard.items[key]; exists && !shard.isExpired(k) {
		shard.Unlock()
		return service.ErrKeyAlreadyExists
	}

An Add should perhaps be considered a sort of write, so the documentation would still be sort of correct.
Writing a key that should have been cleared is also a kind of write :)

I'll gladly create a PR.

kmpm linked a pull request on May 7, 2024 that will close this issue