Memoize with lease #178
Comments
Interesting. It's been a while since I looked at this code, but I think I understand it. The "expire" is a hard deadline and the "lease" is a soft deadline: once the lease has passed, a thread is started to update the cache item. If the API isn't called for a long time, values will still expire and cause a delay, right? Or do you set the lease to something short (like a minute) and the expiration to something long (like a week)? Also, did you read through http://www.grantjenks.com/docs/diskcache/case-study-landing-page-caching.html ? It would be helpful to see Concurrency and Latency charts with this decorator.
The term "lease" is new to me and I like it. Are there similar cache APIs?
Yes, this is the main idea. I can set the expiration to a very high value if we are sure the function will be called in the meantime, so the expiration keeps being renewed and the cache entry almost never expires. The data are opening calendars for public places. The schedule is decided some days in advance, but sometimes there are updates during the current week, so I'll use a lease of 5 minutes because I can't know when changes will happen. Once a week I'll run a full update of the data, so I'm sure it is up to date and aligned at the beginning of the week.
That page was my starting point, but I haven't tested it the way you did. I think the bottleneck could be the number of concurrent threads computing different weeks for different public places at the same time, but usually this doesn't happen. I'm thinking of generalizing it by taking two functions as parameters, one for the timer (maybe with a different name) and one for the condition check. The term "lease" comes from the DHCP protocol, where a lease is assigned for some time and then renewed after it expires.
I don't know. I'm using diskcache because it is very fast, simple, and handles concurrency very well. I've actually tested it with this unicorn app with 16 threads, and it spawns tens of remote calls whose results are saved using diskcache without any glitch. Very nice Python module!
I've only just come across this module (thanks to the Python Bytes podcast for the heads up!) and I have a similar use case: my application makes remote API calls that are costly in time. diskcache will solve this, but the cached data needs a definable lifetime so that it can be refreshed with current data. Love the idea of a lease, btw!
Hi @yurj, I made a couple of small changes and opened a pull request. Feel free to open your own if you've made more modifications.
Can this be rewritten using asyncio to avoid threading? It should be similar to #196.
Hi!
Use case:
A web API. The API must return values as soon as possible, and values should be updated if some time has passed since the last call, so the API is fast and its data is reasonably up to date. The data depends on an external service, which can take from one to several seconds to compute. Data is collected at the beginning of the week using the external service and rarely changes, but when it does, it is nice to have the change displayed within a few minutes.
memoize_stampede was not suitable because when the cache entry expires, the caller has to wait for the whole recomputation, so the API would be slow. So I modified memoize_stampede a little, removing the heuristic and adding a lease time parameter. It is not perfect: if the service has not been called for some time, it will return the stale cached value and then update it. But since it is called from JavaScript on some home pages, this should not happen often, and changes in the data are rare, so it's not a problem.
Here is the code (it would be nice if it were included in the recipes!):
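The snippet from the original post is not reproduced in this text. As a rough illustration only, and not the author's actual code, the lease idea described above might be sketched like this: the value is stored together with the time it was computed, `expire` acts as the hard deadline, and once `lease` seconds have passed the stale value is returned immediately while a background thread recomputes it. The names `memoize_lease` and `DictCache` are hypothetical. `DictCache` is an in-memory stand-in so the sketch runs without dependencies; it only assumes the cache offers `get(key, default)` and `set(key, value, expire=...)`, which `diskcache.Cache` also provides.

```python
import functools
import threading
import time


class DictCache:
    """Hypothetical in-memory stand-in; stores (value, hard_deadline) pairs."""

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def get(self, key, default=None):
        with self._lock:
            item = self._data.get(key)
        # A missing entry or one past its hard deadline counts as a miss.
        if item is None or (item[1] is not None and time.monotonic() >= item[1]):
            return default
        return item[0]

    def set(self, key, value, expire=None):
        deadline = None if expire is None else time.monotonic() + expire
        with self._lock:
            self._data[key] = (value, deadline)


def memoize_lease(cache, expire, lease):
    """Cache results for `expire` seconds; refresh in a thread after `lease` seconds."""
    refreshing = set()        # keys that currently have a refresh thread running
    guard = threading.Lock()  # protects `refreshing`

    def decorator(func):
        def store(key, args, kwargs):
            value = func(*args, **kwargs)
            # Keep the computation time next to the value to check the lease later.
            cache.set(key, (value, time.time()), expire=expire)
            return value

        def refresh(key, args, kwargs):
            try:
                store(key, args, kwargs)
            finally:
                with guard:
                    refreshing.discard(key)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            key = (func.__qualname__, args, tuple(sorted(kwargs.items())))
            item = cache.get(key)
            if item is None:
                # Hard miss (never cached or expired): the caller has to wait.
                return store(key, args, kwargs)
            value, stored_at = item
            if time.time() - stored_at >= lease:
                # Lease passed: start at most one background refresh per key.
                with guard:
                    start = key not in refreshing
                    refreshing.add(key)
                if start:
                    threading.Thread(
                        target=refresh, args=(key, args, kwargs), daemon=True
                    ).start()
            return value  # possibly stale, but returned without waiting

        return wrapper

    return decorator
```

With the numbers mentioned in this thread, a function could be decorated with something like `@memoize_lease(cache, expire=7 * 24 * 3600, lease=300)`. The first call after the lease has passed still returns the old value; only later calls see the refreshed one, which matches the limitation described in the post.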