New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow system compactions to run if zero user compaction jobs have run #4480
Conversation
This change will allow system compactions to postpone user compactions that have had no jobs run yet. Before this, if a user compaction was in the queue and had selected files that overlapped it would block system compactions from running. Now if there are selected files, but the user compaction is not running and hasn't had any jobs completed, the coordinator will clear the selectedFiles column so that the system compaction can run. The fate operation will reset the column again while trying to make progress. This closes apache#4454
I started to work on a follow on to this PR in another branch that is based on the SteadyTime changes from #4494 and is here: cshannon@1997703 It's not tested and just a rough draft of how we could possibly use SteadyTime. I was unsure about a couple of things, first the commit updates CompactionJobGenerator but I'm not sure if we need to also update CompactionCoordinator to use SteadyTime when dealing with selected files. Second, I wasn't sure if the expiration should be an independent check or tied to if compaction jobs have run (ie only expire if compaction jobs run is greater than 0 |
I merged in these changes after SteadyTime was merged into elasticity, the same comments/questions apply |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still looking at this, posting some comments I have made so far.
It's not tested and just a rough draft of how we could possibly use SteadyTime. I was unsure about a couple of things, first the commit updates CompactionJobGenerator but I'm not sure if we need to also update CompactionCoordinator to use SteadyTime when dealing with selected files. Second, I wasn't sure if the expiration should be an independent check or tied to if compaction jobs have run (ie only expire if compaction jobs run is greater than 0
CompactionCoordinator should probably use steady time, that would cover the following situation.
- CompactionJobGenerator generates a system compaction based on selected files that are past expiration
- The coordinator gets this job for reservation and comes to the conclusion that it can reserve the files in the job even though selected because they are passed expiration
I made some suggestions around expired selected files in the coordinator.
core/src/main/java/org/apache/accumulo/core/metadata/schema/SelectedFiles.java
Outdated
Show resolved
Hide resolved
server/base/src/main/java/org/apache/accumulo/server/compaction/CompactionJobGenerator.java
Outdated
Show resolved
Hide resolved
server/base/src/main/java/org/apache/accumulo/server/compaction/CompactionJobGenerator.java
Outdated
Show resolved
Hide resolved
.../src/main/java/org/apache/accumulo/manager/compaction/coordinator/CompactionCoordinator.java
Outdated
Show resolved
Hide resolved
.../src/main/java/org/apache/accumulo/manager/compaction/coordinator/CompactionCoordinator.java
Outdated
Show resolved
Hide resolved
server/base/src/main/java/org/apache/accumulo/server/compaction/CompactionJobGenerator.java
Outdated
Show resolved
Hide resolved
.../src/main/java/org/apache/accumulo/manager/compaction/coordinator/CompactionCoordinator.java
Outdated
Show resolved
Hide resolved
selectionsSubmitted.put(tablet.getExtent(), filesToCompact); | ||
|
||
// TODO: Do we need to handle a race condition for the rejection handler check |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Re this comment. Could relax the check in the rejection handler to tabletMetadata.getSelectedFiles().getFateId().equals(fateId) || tabletMetadata.getCompacted().contains(fateId)
if either of those are true then write when through, but could have changed since to compact some or even completed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alright so the full rejection handler should just look like the following? Just want to make sure i have it right.
mutator.submit(tabletMetadata -> tabletMetadata.getSelectedFiles() != null
&& tabletMetadata.getSelectedFiles().getFateId().equals(fateId) ||
tabletMetadata.getCompacted().contains(fateId));
This change will allow system compactions to postpone user compactions that have had no jobs run yet. Before this, if a user compaction was in the queue and had selected files that overlapped it would block system compactions from running. Now if there are selected files, but the user compaction is not running and hasn't had any jobs completed, the coordinator will clear the selectedFiles column so that the system compaction can run. The fate operation will reset the column again while trying to make progress.
The current test works by creating two tables and setting compactions to be slow using the slow iterator and also setting system compactions to be disabled. The test will start compacting one table so the compactor is busy. Next a user compaction is started that will waiting in the queue and then the system compactions are re-enabled so a system compaction will be scheduled. A custom test planner is set so that system compactions take higher precedence. This means that when the compation coordinator goes to start the next compaction job, it will try and grab the system compaction ahead of the user compaction. Before this change this would have been blocked by the selectedFiles column being set by the user compaction, however because the user compaction has had no jobs run it can proceed and clear the column and run the system compaction. Later the fate operation will re-set it again and the user compaction will run second.
This closes #4454