Automatic Pipeline deduplication #98

Open
thedodd opened this issue Sep 26, 2021 · 0 comments
Labels
A-clients Hadron clients A-pipelines Hadron server pipelines
@thedodd
Collaborator

thedodd commented Sep 26, 2021

Pipelines should deduplicate events from the source Stream as Pipeline instances are instantiated. This would ensure that no duplicate Pipeline instances are run for a given partition. However:

  • This does not, and should not, guard against valid retransmission of a root event.
  • This does not guard against cases where an event was duplicated across different partitions.
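
A minimal sketch of the idea, not Hadron's actual implementation. It assumes a root event's identity is its (source, id) pair and that each partition tracks its own instances; the names `RootEvent`, `PartitionPipelines` and `instantiate` are hypothetical. Retransmission of a root event to stage handlers is deliberately not modeled here.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class RootEvent:
    # Hypothetical event shape; identity is assumed to be (source, id).
    source: str
    id: str
    offset: int


@dataclass
class PartitionPipelines:
    """Tracks the Pipeline instances of a single partition (illustrative only)."""

    # Maps event identity -> offset of the root event that created the instance.
    instances: dict = field(default_factory=dict)

    def instantiate(self, event: RootEvent) -> bool:
        """Create a Pipeline instance unless one already exists for this identity.

        Returns True if a new instance was created, False if the event was
        treated as a duplicate within this partition.
        """
        identity = (event.source, event.id)
        if identity in self.instances:
            return False
        self.instances[identity] = event.offset
        return True
```

Because each partition keeps its own table, the same identity written to two different partitions still yields two instances, which is the second caveat above:

```python
p0, p1 = PartitionPipelines(), PartitionPipelines()
p0.instantiate(RootEvent("svc", "abc", offset=0))  # new instance on partition 0
p0.instantiate(RootEvent("svc", "abc", offset=5))  # duplicate, no new instance
p1.instantiate(RootEvent("svc", "abc", offset=0))  # new instance on partition 1
```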

Old proposal (invalid):

In cases where a duplicate root event has been written to the source Stream of a Pipeline, it may be convenient for users to simply return a Skip variant in the response payload, instructing Hadron to cancel the remainder of the Pipeline for that root event.

As it currently stands, without this feature, users simply need to model this themselves by returning events of a specific type indicating that the Pipeline overall is a no-op or the like.
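
For illustration, that user-side modeling might look roughly like the sketch below. The event type names, the dict-shaped events and the `already_processed` flag are all assumptions made for the example, not part of any Hadron client API. How a handler would decide that a root event is a duplicate is left out on purpose.

```python
NOOP_TYPE = "example.pipeline.noop"  # hypothetical marker event type


def first_stage(root: dict, already_processed: bool) -> dict:
    """Emit a no-op marker event instead of doing work for a duplicate root event."""
    if already_processed:
        return {"type": NOOP_TYPE, "source": "first-stage", "id": root["id"]}
    return {"type": "example.order.accepted", "source": "first-stage", "id": root["id"]}


def second_stage(previous: dict) -> dict:
    """Downstream stages pass the marker along rather than acting on it."""
    if previous["type"] == NOOP_TYPE:
        return previous
    return {"type": "example.order.shipped", "source": "second-stage", "id": previous["id"]}
```

Every stage still runs, but each one only forwards the marker, so the Pipeline as a whole does nothing for that root event.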

To implement this, we will need to change the way Pipeline instances (an execution of a Pipeline over a specific root event) are modeled.

Currently there is no real state associated with them; they are tracked via metadata offsets and the like. Tracking Pipeline instance state would also be a good fit for the future metrics and monitoring UI.

The above is an invalid argument, as a Pipeline handler has no way to discern whether the duplicate is just a retransmission of the original root event or a new event, processed at a later point, which has a duplicate identity.

@thedodd thedodd created this issue from a note in Main (Backlog) Sep 26, 2021
@thedodd thedodd added A-clients Hadron clients A-pipelines Hadron server pipelines labels Sep 26, 2021
@thedodd thedodd changed the title Allow Pipeline stage response to "skip" processing Automatic Pipeline deduplication Sep 26, 2021