Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Redis Cluster Scan Picks Random Node #345

Open
cdennison opened this issue Jul 14, 2022 · 3 comments
Open

Redis Cluster Scan Picks Random Node #345

cdennison opened this issue Jul 14, 2022 · 3 comments
Labels

Comments

@cdennison
Copy link

Version

4.3.3-SNAPSHOT

Context

Previously scan while using Redis cluster was disabled - see this commit that recently enabled it:
f43502b.

Do you have a reproducer?

A simple test case like this under RedisClusterTest.java

@Test(timeout = 30_000)
  public void testScan() {
    client
      .connect(onCreate -> {
        should.assertTrue(onCreate.succeeded());

        final RedisConnection cluster = onCreate.result();
        cluster.exceptionHandler(should::fail);
        
        cluster.send(cmd(SCAN).arg(0).arg("MATCH").arg("*").arg("COUNT").arg(100), val -> {
          System.out.println(val);
          });
      });
  }

Will show that for scan it hits this line and picks a node at random:

https://github.com/vert-x3/vertx-redis-client/blob/master/src/main/java/io/vertx/redis/client/impl/RedisClusterConnection.java#L158

I assume that the reason commands like "keys" run on all nodes but not "scan" is because scan has a cursor so it would be complex to iterate on all nodes but I can't imagine why you would want to run "scan" randomly. I was hoping to explore using a clustered scan with hash tags and thinking maybe you could provide the slot when calling "send."

@cdennison cdennison added the bug label Jul 14, 2022
@cdennison cdennison changed the title Redis Scan Picks Random Node Redis Cluster Scan Picks Random Node Jul 14, 2022
@pmlopes
Copy link
Member

pmlopes commented Jul 15, 2022

Hi @cdennison the goal with the commit you mention was to respect what redis states about commands, for SCAN we can see:

127.0.0.1:6379> command info scan
1)  1) "scan"
    2) (integer) -2
    3) 1) readonly
    4) (integer) 0
    5) (integer) 0
    6) (integer) 0
    7) 1) @keyspace
       2) @read
       3) @slow
    8) 1) "nondeterministic_output"
       2) "request_policy:special"
    9) (empty array)
   10) (empty array)

Which says that there are keys involved with the command so, we assume it can be run on any node.

I believe we need to extend the metadata with the tips field:

https://redis.io/docs/reference/command-tips/

And assume that any command with tip: request_policy:special should be tagged as not allowed.

@cdennison
Copy link
Author

Hi @pmlopes

I agree that any command with request_policy:special should not be allowed to run in a clustered client unless "special" care has been taken to implement it correctly. I suggest you simply roll back that previous change to not allow "SCAN." In terms of the other commands that are now recently allowed they all look fine to me.

ASKING, AUTH, BGREWRITEAOF, BGSAVE, CLIENT, COMMAND, CONFIG,
DEBUG, DISCARD, INFO, LASTSAVE, LATENCY, LOLWUT, MEMORY, MODULE, MONITOR, PFDEBUG, PFSELFTEST,
PING, READONLY, READWRITE, REPLCONF, REPLICAOF, ROLE, SAVE, SCAN, SCRIPT, SELECT, SHUTDOWN, SLAVEOF, SLOWLOG, SWAPDB,
SYNC, SENTINEL

You can see here that the only command that is "special" is SCAN https://github.com/redis/redis-doc/blob/master/commands.json#L10964.

Also from what I've understood, the way SCAN might in the future be implemented to work correctly with a Clustered client would be if it used hash tags. I can't find any actual example of this but my understanding is that you could do the following:

set {accountid123}.randomkey1 value
set {accountid123}.randomkey2 value
set {accountid456}.randomkey3 value
etc.

This will make sure that all {accountid123} keys end up on the same node.

Then lets say you want to SCAN for accountid123 you could do:

get {accountid123} which will tell you the right node to use for your scan because it is the hash tag.

If you randomly SCAN then you may or may not find the accountid123 keys.

@cdennison
Copy link
Author

@pmlopes fyi I'm now using SCAN successfully in Production with a cluster like this (I had to add a hash key to all my records) - this basically just says treat "scan" like a "get" etc and use the key in position "2"

The idea is that when you get the slot for something like this {123}blahblahetc - the library correctly only looks at the "123" which is the hash

io.vertx.redis.client.Command vertxSCAN = new CommandImpl("scan", -2, true, false, false, new KeyLocator(true, new BeginSearchIndex(3), new FindKeysRange(0, 1, 0)));

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Development

No branches or pull requests

2 participants