[Docs] provide clarity on the cache that flyte propeller uses #4611
Comments
🤖 Hey @samhita-alla! Nice to meet you! I'm Dosu, a friendly bot here to assist you while we wait for a human maintainer to step in. I'm designed to help you solve bugs, answer questions, and generally make your experience smoother. Don't worry, we'll look into this caching issue together. Let's get started!

The cache that FlytePropeller uses is a write-through cache that sits on top of the metastore. It optimizes read and write latency for all metadata IO operations, where metadata refers to the input and output pointers. The cache is configured through the `storage.cache` setting described in the documentation excerpt below.

FlytePropeller also minimizes the number of redundant loops, thus improving free slots, by using a trick in etcd: for every write, it records the last known version number in the database and then waits for the change to propagate to the informer cache.

Another area of slowdown could be the size of the input-output cache that FlytePropeller maintains in memory. This is set when configuring the storage for FlytePropeller. The rule of thumb is to allocate half of FlytePropeller's memory limit to the cache (for example, a 2 GiB cache for a 4 GiB limit).

Here is the relevant excerpt from the documentation:

    * - ``storage.cache``
      - propeller
      - This config is used to configure the write-through cache used by FlytePropeller on top of the metastore
      - FlytePropeller uses the configure blob-store (can be changed to something more performant in the future) to optimize read and write latency, for all metadata IO operations. Metadata refers to the input and output pointers

And:

    Another area of slowdown could be the size of the input-output cache that FlytePropeller maintains in-memory. This can be configured, while configuring the storage for FlytePropeller. Rule of thumb, for FlytePropeller with x memory limit, allocate x/2 to the cache

The write-through cache mechanism in FlytePropeller is implemented in the following code:

    // Tee the incoming stream so the bytes written to the store are also captured in buf.
    var buf bytes.Buffer
    teeReader := io.TeeReader(raw, &buf)
    err := s.RawStore.WriteRaw(ctx, reference, size, opts, teeReader)
    if err != nil {
        return err
    }
    // The store write succeeded; cache the same bytes with no expiry.
    err = s.cache.Set([]byte(reference), buf.Bytes(), neverExpire)
    if err != nil {
        s.metrics.CacheWriteError.Inc()
        err = errors.Wrapf(ErrFailedToWriteCache, err, "Failed to Cache the metadata")
    }

This mechanism writes the data to the underlying storage first and then stores the same bytes in the cache. A failed storage write returns early and leaves the cache untouched, while a failed cache write is counted in a metric and wrapped in an error. Because the cache holds whatever was last written, reads of frequently accessed data can be served from memory.

I hope this provides the clarity you were looking for. If you have any more questions, feel free to ask.

Description
reference slack thread: https://discuss.flyte.org/t/16095263/please-could-someone-explain-which-cache-the-docs-are-referr#a02e379e-89dd-4fd9-ab49-df6b29bbcb65
doc page to update: https://docs.flyte.org/en/latest/deployment/configuration/performance.html#optimize-flytepropeller-configuration