Client-side caching support #402

craig-day · 2023-08-18T20:14:56Z

Motivation:

Redis has some commands and server side support for clients to implement client-side caching by using a key tracking approach.

This proposes adding support by defining a new CachingRedis interface that extends Redis, along with a RedisClientCache interface to act as the local store. The client implementation will then checkout and maintain a long-lived connection that will act in subscribe mode to listen for invalidations. This is the approach required for the RESP2 protocol, but technically isn't required for RESP3 since connections can multiplex data and response types. However, I chose to keep using this approach to keep things simple by leveraging a long-lived connection as a read stream in subscribe mode.

I also included a simple implementation of a local cache store, an LRU. I'd be happy to exclude this if we want consumers to be forced to supply their own, but I started with something to make it usable out of the box.

Conformance:

Your commits should be signed and you should have signed the Eclipse Contributor Agreement as explained in https://github.com/eclipse/vert.x/blob/master/CONTRIBUTING.md
Please also make sure you adhere to the code style guidelines: https://github.com/vert-x3/wiki/wiki/Vert.x-code-style-guidelines

craig-day · 2023-08-18T20:16:07Z

@pmlopes @vietj I opened this as a draft to get it on your radar, and I think the code is pretty much ready for review. I'm still working on adding additional test cases, but I don't plan to change the API or implementation except based on review feedback.

src/main/asciidoc/index.adoc

src/main/java/io/vertx/redis/client/CachingRedisOptions.java

src/main/java/io/vertx/redis/client/impl/CachingRedisClientImpl.java

src/main/java/io/vertx/redis/client/impl/cache/LRUClientSideCache.java

src/test/java/io/vertx/test/redis/CachingRedisClientTest.java

Ladicek · 2023-08-21T07:28:28Z

I don't feel qualified to review the caching code, but it seems like the 2 concurrent maps are assumed to be synchronized (not the Java language synchronized keyword), even though they in fact don't necessarily have to be. Implementing a cache is hard.

Also, not sure if this was intentional or not, but it seems kinda like a big deal: if you have a CachingRedisClient, call connect() on it, and use the resulting connection to perform operations, the cache will be completely bypassed. It seems to me that implementation-wise, caching should not be a client wrapper, it should be a connection wrapper. Caching on the client-level methods will come for free, as they just do connect().send() (or connect().batch()).

EDIT: I mean, there would still likely have to be a caching-aware client, which would maintain the cache itself and the invalidation connection. All connections created from this client would share the caching stuff. It likely doesn't have to be a new user-visible interface, though.

craig-day · 2023-08-21T15:58:40Z

@Ladicek thanks for all of the points, and I also personally had the debate about client vs. connection. I decided to start here with the biggest target and use-case being for the RedisAPI interface. In my opinion, when you are calling connect and interacting directly with a connection, I feel like you've acknowledged that you don't necessarily get extra features. You just a connection to a server in which you can send commands, i.e. the lowest level option. Caching felt more like a higher level abstraction that wasn't necessarily explicit to the connection, so I started at the "client" layer. As long as the consumer uses send() or passes the client to RedisAPI (which uses send) then they get the benefits. It felt similar to connection pooling. If you want to leverage the pool, you can just use send and the client will automatically checkout-send-release connections, but if you call connect then you have to deal with managing the sending and releasing.

I do agree the most robust solution would be to directly integrate with the connection implementation to configure tracking based on options, but I wasn't sure where the invalidations listener and cache manager would live in such a case. Without going down the road of verticles or workers or using the event bus, it felt easier to make everything explicit and opt-in with a direct client. I could perhaps remove the implements Redis on the client, and then not expose connect so there isn't a confusing case where you skip the caching you explicitly opted in to.

src/main/java/io/vertx/redis/client/CachingRedisClient.java

Ladicek · 2023-08-22T07:10:26Z

@craig-day OK, that seems like a valid decision, but I guess it should be documented then?

craig-day · 2023-09-21T17:39:40Z

@vietj @Ladicek I finally got back around to this. I've pushed a number of updates and I think this is ready for review.

Some notes on the design choices. I chose to stick with a wrapper for a few reasons:

We need somewhere to keep the long-lived invalidations connection.
This same place also needs to have access to the local store to invalidate keys as it is notified.
Additionally, there is some additional configuration needed to control how the client tracking gets configured when connections are opened.

I considered an alternative approach like creating a CacheManager that the client implementation could use, similar to the connection manager. The cache manager would maintain the long-lived connection and have access to the store, but to do this I would need to add an additional 2 arguments to each client constructor. One for the cache store, and another for the caching options. It would be nice to do this because we could then send the tracking configuration as the setup command for each connection, but since we only support a single setup command it means that this wouldn't work in the sentinel case since we already send a readonly command in some scenarios.

As an optimization, however, I did make the connections somewhat cache-aware. Each connection tracks whether it has had client tracking configured. This is done to avoid sending 2 commands for every command a client sends. Once a connection has had client tracking configured, it doesn't need to change unless it gets explicitly turned off. On that note, currently a client could "break things" if they were to enable caching, checkout a connection, and then send the CLIENT TRACKING off command at some point. I don't know if we want to handle this explicitly, or even document it, but it is something that could happen.

craig-day · 2023-09-21T18:56:10Z

@Ladicek I think the last commit I just pushed is doing most of what you wanted. I removed the client creation away from the wrapper and into the default Redis.createClient methods. To do this I brought the CachingRedisOptions into RedisOptions in a similar way to the PoolOptions and NetClientOptions nested structures. Now to get caching you just enable the option. Doing this required a change in how custom cache implementations are provided, so that had to move to a ServiceLoader based pattern.

simplify cache impls; some javadocs and tests more javadocs; invalidation handler on client basic support for max age in default impls initial PR feedback; naming consistencies checkpoint on wrapped connection push tracking status down to actual conn impl simplify cache impls to just LRU backed by LinkedHashMap cleanup client wrapper; test simple operation docs update make caching configured via options, not an explicit wrapper regenerate options converters