Add option for balancing a single table at a time to the host regex balancer #4521
Comments
This might work, but it definitely needs some experimenting, as you said. Thinking it through, one concern I have with this approach is that you might end up with similar churn, depending on the table balancer algorithm (whether it's the default I was trying to think through a worst-case example: let's say you have a few tables and the balancing starts.
How bad the churn is will depend on how likely the table balancing algorithm is to re-trigger rebalancing of already-balanced tables as later tables in the sorted order are balanced. The default I think an interesting approach, if it isn't too hard to implement, would be to run the algorithm as a dry run, gathering the list of migrations without actually performing them. Then you could apply that set of migrations to a simulated state and run the algorithm again. The number of iterations to run through before actually performing work could be configurable. This may be a bigger change, and I'm not sure what other consequences it would have, but it could be worth a shot.
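The dry-run idea above can be sketched roughly as follows. This is only an illustration under assumptions: `DryRunBalance`, `Migration`, and `plan` are hypothetical names, the balancer is modelled as a plain function from a tablet-to-server map to a list of proposed moves, and none of this is the actual Accumulo balancer API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;

class DryRunBalance {
  /** Hypothetical migration record: move one tablet to a target server. */
  static final class Migration {
    final String tablet;
    final String toServer;

    Migration(String tablet, String toServer) {
      this.tablet = tablet;
      this.toServer = toServer;
    }
  }

  /**
   * Runs the (assumed) balancer function against a simulated copy of the
   * assignments for up to maxIterations rounds, then returns only the net
   * moves, i.e. tablets whose simulated location differs from the real one.
   */
  static List<Migration> plan(Map<String, String> current,
      Function<Map<String, String>, List<Migration>> balancer, int maxIterations) {
    Map<String, String> simulated = new HashMap<>(current);
    for (int i = 0; i < maxIterations; i++) {
      List<Migration> proposed = balancer.apply(Collections.unmodifiableMap(simulated));
      if (proposed.isEmpty()) {
        break; // the simulated state has converged
      }
      for (Migration m : proposed) {
        simulated.put(m.tablet, m.toServer); // apply to the simulation only
      }
    }
    List<Migration> net = new ArrayList<>();
    // TreeMap gives a deterministic output order for the net migrations.
    for (Map.Entry<String, String> e : new TreeMap<>(simulated).entrySet()) {
      if (!e.getValue().equals(current.get(e.getKey()))) {
        net.add(new Migration(e.getKey(), e.getValue()));
      }
    }
    return net;
  }
}
```

With a balancer that moves a tablet away in one round and back in the next, the net plan after two simulated rounds is empty, so the back-and-forth move would never be executed. Whether real per-table balancers converge within a few rounds is exactly what would need testing.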
I was wondering whether, instead of going in a predictable order, the algorithm could pick a random start point. Maybe treat the list of tables (by id?) as a ring buffer, start somewhere random in the buffer, and walk through the candidates until you reach the start point again? This may help prevent getting stuck on tables that sort first. An alternative may be to gather a count of the migrations each table needs and then work the tables off in order, starting with the largest counts. Would this help it converge faster?
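Both ordering ideas above are easy to prototype for comparison. The sketch below is a minimal illustration, assuming table ids are plain strings and that a per-table count of pending migrations is already available; `TableOrdering` and its method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Random;

class TableOrdering {
  /** Walks the sorted table ids as a ring, beginning at index start. */
  static List<String> ringOrder(List<String> tableIds, int start) {
    List<String> sorted = new ArrayList<>(tableIds);
    Collections.sort(sorted);
    int n = sorted.size();
    List<String> out = new ArrayList<>(n);
    for (int i = 0; i < n; i++) {
      out.add(sorted.get(Math.floorMod(start + i, n)));
    }
    return out;
  }

  /** Same walk, but from a randomly chosen start point. */
  static List<String> randomRingOrder(List<String> tableIds, Random rnd) {
    if (tableIds.isEmpty()) {
      return new ArrayList<>();
    }
    return ringOrder(tableIds, rnd.nextInt(tableIds.size()));
  }

  /** Orders tables by pending migration count, largest first; ties by id. */
  static List<String> byPendingDesc(Map<String, Integer> pending) {
    List<String> ids = new ArrayList<>(pending.keySet());
    ids.sort((a, b) -> {
      int c = Integer.compare(pending.get(b), pending.get(a));
      return c != 0 ? c : a.compareTo(b);
    });
    return ids;
  }
}
```

The ring walk visits every table exactly once per pass, so no table is starved; the count-based order instead spends effort first where the most work is pending.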
I'm working on a draft of this change now to use for testing.
Another "feature" may be to prioritize system tables, particularly the metadata table, so that the system tables are balanced first.
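As a rough sketch of that prioritization, a comparator can place system tables ahead of user tables. The set of system table ids is passed in as a parameter here because how they are identified is an assumption; `SystemTablesFirst` is a hypothetical helper, not existing Accumulo code.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Set;

class SystemTablesFirst {
  /** Returns table ids with system tables first, each group sorted by id. */
  static List<String> order(Collection<String> tableIds, Set<String> systemIds) {
    List<String> out = new ArrayList<>(tableIds);
    out.sort((a, b) -> {
      boolean sysA = systemIds.contains(a);
      boolean sysB = systemIds.contains(b);
      if (sysA != sysB) {
        return sysA ? -1 : 1; // system tables sort ahead of user tables
      }
      return a.compareTo(b);
    });
    return out;
  }
}
```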
@EdColeman - Those are some interesting ideas as well. I think this may be something where we just need to test a couple of approaches and see what works best.
Is your feature request related to a problem? Please describe.
The host regex balancer calls multiple per-table balancers. Each per-table balancer may make balancing decisions based on the current state of its own tablets and the current state of all other tablets. When the decisions of multiple per-table balancers are all executed at once, it may cause churn because their assumptions about other tables are partially invalidated. If the host regex balancer could optionally focus on one table at a time, it may help avoid or lessen this churn.
Describe the solution you'd like
Add an option to the host regex balancer that enables balancing a single table at a time. This option would be off by default. The behavior of this option would be as follows for this code section.
Need to experiment with this solution to see if it helps with churn.
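One possible reading of the proposed option, as a minimal sketch for experimentation: visit the tables in order and, when the option is on, stop at the first table whose balancer proposes any migrations. The names below are hypothetical, migrations are modelled as plain strings, and this is not the balancer's actual code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

class SingleTableBalance {
  /**
   * Collects migrations from the (assumed) per-table balancer function.
   * With oneAtATime set, only the first table that proposes migrations
   * contributes; later tables wait for a subsequent balancing pass.
   */
  static List<String> balance(List<String> tables,
      Function<String, List<String>> perTableBalancer, boolean oneAtATime) {
    List<String> migrations = new ArrayList<>();
    for (String table : tables) {
      List<String> proposed = perTableBalancer.apply(table);
      migrations.addAll(proposed);
      if (oneAtATime && !proposed.isEmpty()) {
        break; // focus on this table only for the current pass
      }
    }
    return migrations;
  }
}
```

With the option off, the loop behaves as before and returns every table's migrations in one pass.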