Disallow table configuration at the system level to avoid issues with system tables #4537

Open
ctubbsii opened this issue May 8, 2024 · 0 comments
Labels
enhancement This issue describes a new feature, improvement, or optimization.
ctubbsii commented May 8, 2024

The configuration hierarchy currently allows table configuration (table.* properties) to be set in the conf/accumulo.properties file (or on the command-line) and also in ZooKeeper at the system level. If set at either of these locations, they will apply to all tables in Accumulo, in all namespaces.
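To illustrate why a system-level setting reaches every table, here is a minimal sketch of a most-specific-first lookup. It assumes the scopes are searched in the order table, namespace, system (ZooKeeper), site (accumulo.properties or command-line). The class and method names are hypothetical and are not Accumulo's actual configuration classes.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical model of the lookup order; not Accumulo's real implementation.
final class ConfigHierarchySketch {

  // Returns the first value found, searching the most specific scope first.
  // Assumed order of the list: table, namespace, system (ZK), site.
  static Optional<String> resolve(String key, List<Map<String, String>> scopesMostSpecificFirst) {
    for (Map<String, String> scope : scopesMostSpecificFirst) {
      String value = scope.get(key);
      if (value != null) {
        return Optional.of(value);
      }
    }
    return Optional.empty();
  }
}
```

In this model, a table.* key that exists only in the site or system map is returned for every table that does not override it, in every namespace, which would include the system tables.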

We have special exceptions for the metadata table (or maybe all system tables?) that prevent certain table configurations from affecting it (I don't remember the details right now... maybe related to constraints?), because we know they can be a problem. However, other table configuration can also be a problem, including iterators/filters, classloader contexts, etc. This can result in unexpected behavior when a user clones a table and copies its configuration as well (raising the question: is it copying the effective configuration from the whole configuration hierarchy, or just what is set on the table?).

To avoid many of these issues, we should disallow setting table configuration in a way that affects all tables. Namespaces are the appropriate place to configure table configuration to affect many tables at once, not the system level.

Table configuration should be ignored/disallowed in the SiteConfiguration (accumulo.properties and command-line) and the SystemConfiguration (in ZK). The shell should error when trying to set these, and any existing configuration should result in an ERROR or WARN message stating that it has no effect.
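The check described above could look roughly like the following sketch. It assumes that "table configuration" means any key starting with the table. prefix. The class, the method names, and the message wording are hypothetical, not existing Accumulo code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper; names and message text are illustrative only.
final class SystemLevelTablePropCheck {

  static final String TABLE_PREFIX = "table.";

  // Returns the table.* keys found in a site- or system-level property map, sorted by key.
  static List<String> findDisallowed(Map<String, String> systemLevelProps) {
    List<String> disallowed = new ArrayList<>();
    for (String key : new TreeMap<>(systemLevelProps).keySet()) {
      if (key.startsWith(TABLE_PREFIX)) {
        disallowed.add(key);
      }
    }
    return disallowed;
  }

  // Builds an explicit message naming the key, where it was set, and what to do instead.
  static String warning(String key, String source) {
    return "Table property '" + key + "' set in " + source
        + " has no effect; set it on a namespace or table instead";
  }
}
```

A startup check could log warning(...) for each key returned by findDisallowed, and the shell could reject the same keys with an error when a user tries to set them at the system level.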

We can add a warning about this in 2.1, and change the behavior in 3.1.

If there is still a use case for setting table properties in the accumulo.properties file, then we can consider adding that as a new feature to explicitly configure namespaces and/or tables, something along the lines of table[mynamespace.mytable].someProp=someValue to affect a specific table, and table[mynamespace].someProp=someValue to affect all tables in a namespace (table[].someProp=someValue to affect the default namespace). This is just one possibility. commons-configuration2, which we use for parsing the configuration, may have some useful features and a more natural syntax for this than the one I suggested here. It would be better to omit this feature entirely if we don't actually need it. However, I think the main place where we might need it is for system tables on initial startup, to set things like the per-table volume chooser, context, balancer, block cache configuration, HDFS replication factor, etc. that are useful to have set at system initialization, before any user tables are created. But such things should be set on a specific namespace or table, and should not be settable globally. So, maybe there's a way we can do that in the configuration instead of what we do now.
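As a rough sketch of how the bracketed syntax suggested above might be parsed, the following splits a key into a namespace, an optional table, and the underlying table.* property. This only illustrates the proposal. The class and field names are hypothetical, and a scope without a dot is treated as a namespace, as in the examples above.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser for the proposed table[scope].prop key syntax.
final class ScopedTableKeySketch {

  // Matches table[<scope>].<rest>, where <scope> may be empty.
  private static final Pattern SCOPED = Pattern.compile("^table\\[([^\\]]*)\\]\\.(.+)$");

  static final class ScopedKey {
    final String namespace; // "" means the default namespace
    final String table;     // null means all tables in the namespace
    final String property;  // the plain property name, e.g. table.someProp

    ScopedKey(String namespace, String table, String property) {
      this.namespace = namespace;
      this.table = table;
      this.property = property;
    }
  }

  // Returns empty for keys that do not use the bracketed form.
  static Optional<ScopedKey> parse(String key) {
    Matcher m = SCOPED.matcher(key);
    if (!m.matches()) {
      return Optional.empty();
    }
    String scope = m.group(1);
    int dot = scope.indexOf('.');
    String namespace = dot < 0 ? scope : scope.substring(0, dot);
    String table = dot < 0 ? null : scope.substring(dot + 1);
    return Optional.of(new ScopedKey(namespace, table, "table." + m.group(2)));
  }
}
```

With this shape, an unscoped key such as table.someProp does not parse, so it can be reported as disallowed rather than silently applied to every table.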

Another possibility is to change the behavior so that such properties only affect the accumulo system namespace (or only the default namespace), but I don't think that option is a good idea: it could cause a lot of confusion about what is affected, because it would be a significant change in behavior for existing configuration. We should be very explicit in the ERROR/WARN messages about what is allowed, and the configuration should be very explicit about what we want it to affect, if we support it.

@ctubbsii ctubbsii added the enhancement This issue describes a new feature, improvement, or optimization. label May 8, 2024
@ctubbsii ctubbsii added this to To do in 3.1.0 via automation May 8, 2024
Projects
3.1.0
To do