
[Feature Request]: Pass total number of active replicas as a context variable #2889

Open
1 of 3 tasks
dberardo-com opened this issue Apr 12, 2023 · 6 comments

Comments

@dberardo-com

Feature Type

  • Adding new functionality to Nuclio

  • Changing existing functionality in Nuclio

  • Removing existing functionality in Nuclio

Problem Description

In order to do load balancing and deduplication it might be useful for each replica of the same function to know:

  • how many other replicas there are
  • what its own "sequence number" is within the set of replicas

Feature Description

Let's assume that replicas A and B both need to subscribe to the same MQTT topic and cannot use shared subscriptions (because they need to react differently to the same messages). To prevent A from reacting to all messages, as B would, it could be useful for A to pick up only the "odd" messages and B only the "even" ones.

The same principle would of course apply to 3, 4, 5, ... replicas, using the <sequence_id> % <total_num_replicas> logic to deduplicate.

The total_num_replicas value of course needs to change when the number of replicas is scaled up or down, to allow the existing replicas to rebalance their subscriptions.
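The proposed scheme can be sketched as follows. Note that `replica_index` and `total_replicas` are hypothetical inputs here: this is exactly what the feature request asks Nuclio to expose, and they are not available from the context today.

```python
# Sketch of the requested modulo-based deduplication scheme.
# `replica_index` and `total_replicas` are hypothetical values that
# Nuclio would need to expose to each replica.

def should_handle(sequence_id: int, replica_index: int, total_replicas: int) -> bool:
    """A replica handles a message only if the message's sequence id,
    modulo the total number of replicas, matches its own index."""
    return sequence_id % total_replicas == replica_index

# With 2 replicas, replica 0 takes the even messages and replica 1 the odd ones.
even = [m for m in range(6) if should_handle(m, 0, 2)]
odd = [m for m in range(6) if should_handle(m, 1, 2)]
```

With 3, 4, 5, ... replicas the same function partitions the stream without any coordination, as long as every replica agrees on `total_replicas`.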

Alternative Solutions

I am not aware of any.

Additional Context

No response

@liranbg
Contributor

liranbg commented Apr 13, 2023

Hi @dberardo-com
The max replicas value is given as part of the function config, which is available inside the container.
Having said that, giving each pod its own "serial" number is an anti-pattern for function pods, which are meant to be stateless. Kubernetes supports that only via StatefulSets, which is not what we use in Nuclio.

What you are trying to build here is a "locking" mechanism.

@dberardo-com
Author

Maybe this can be seen as a lock, but the functions themselves are still stateless. Actually, I still see this as a deduplication problem, similar to what one would have with Kafka Streams and others. Basically I am trying to introduce a home-made version of Kafka's "consumer groups", if you will.

One use case I am thinking of is jobs: basically one would have a queue of jobs, and each replica would only pick the jobs whose id, modulo the replica count, matches its sequence number. The mechanism is quite robust as long as the number of replicas is fixed and does not change.

The max replicas value is given as part of the function config, which is available inside the container.

Is this part of the context object, or can a replica access this value in some other way? And is there also a min replicas equivalent? This could be used as a workaround to get the total number of replicas.

As for the serial number: I know that there is a "worker_id" somewhere in Nuclio, but I am not sure how it works and whether one could use it as a serial number.

@liranbg
Contributor

liranbg commented Apr 13, 2023

The worker id is unique per worker and could thus be used as a serial number. Usage: context.worker_id.
The number of replicas is given by the function config, which is mounted into the container and can be read during init_context.
What is left is to provide the mechanism across replicas and not only across workers (as workers reside within a single replica).
To provide a cross-replica "locking mechanism" you may implement it within your code, or use something like Redis or etcd as a third party.
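A minimal sketch of picking the replica bounds up during init_context, assuming the mounted function config has already been parsed (e.g. with a YAML library) into a dict mirroring the function spec. The actual mount path is deployment-specific and omitted here; minReplicas/maxReplicas are the field names used in the Nuclio function spec.

```python
# Sketch: extract min/max replicas from a parsed function-config dict.
# The config dict shown in init_context is a placeholder -- in a real
# function you would load and parse the file Nuclio mounts into the
# container (path is deployment-specific).

def get_replica_bounds(function_config: dict) -> tuple:
    """Return (min_replicas, max_replicas) from a function spec dict,
    defaulting to 1 when a field is absent."""
    spec = function_config.get("spec", {})
    return spec.get("minReplicas", 1), spec.get("maxReplicas", 1)

def init_context(context):
    # Placeholder config standing in for the parsed mounted file.
    config = {"spec": {"minReplicas": 2, "maxReplicas": 4}}
    context.user_data.replica_bounds = get_replica_bounds(config)
```

If minReplicas equals maxReplicas, this gives the fixed replica count the deduplication scheme needs; otherwise it only bounds it, which is the gap this issue is about.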

For our internal stream implementation (v3io) we use a state file which holds all replica holder ids, with a locking mechanism on top of it where each holder tries to acquire a specific "shard". If a rebalance or failure occurs, we perform "resharding" again. This works well, but the pain points are mostly handling errors and recovering from them (avoiding data loss, etc.).
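The resharding idea can be illustrated with a deterministic assignment: given the set of live holder ids from the shared state file, every holder computes the same shard layout on its own. This is only a sketch of the concept, not the actual v3io implementation.

```python
# Sketch: deterministically spread shards across the live holders so
# that every holder, reading the same holder-id set from the state
# file, arrives at the same assignment without extra coordination.

def assign_shards(num_shards: int, holder_ids: list) -> dict:
    """Round-robin shards over the sorted holder ids."""
    holders = sorted(holder_ids)
    assignment = {h: [] for h in holders}
    for shard in range(num_shards):
        assignment[holders[shard % len(holders)]].append(shard)
    return assignment
```

When a holder joins or disappears, re-running the function on the updated holder set performs the "resharding" step; the hard part, as noted above, is doing that hand-off without losing in-flight data.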

@dberardo-com
Author

The worker id is unique per worker and could thus be used as a serial number. Usage: context.worker_id.

OK, but is this number between min_replicas and max_replicas? Or is it at least a sequential number? I mean, if I have 2 different functions (A and B), each with 3 replicas, are the worker_ids of A: 1, 2, 3 and B: 4, 5, 6, or could they be mixed up, as in A: 4, 19, 22 and B: 33, 27, 35?

Lease

I assume that a service account will be needed in the pods to use this feature, right? Or does Nuclio already inject a service account token into every pod?

For our internal stream implementation (v3io) we use a state file which holds all replica holder ids, with a locking mechanism on top of it where each holder tries to acquire a specific "shard". If a rebalance or failure occurs, we perform "resharding" again. This works well, but the pain points are mostly handling errors and recovering from them (avoiding data loss, etc.).

Sure enough, I could use a shared volume mount, but I wanted to keep it as simple and stateless as possible.

@liranbg
Contributor

liranbg commented Apr 13, 2023

OK, but is this number between min_replicas and max_replicas? Or is it at least a sequential number? I mean, if I have 2 different functions (A and B), each with 3 replicas, are the worker_ids of A: 1, 2, 3 and B: 4, 5, 6, or could they be mixed up, as in A: 4, 19, 22 and B: 33, 27, 35?

It has nothing to do with replicas; it relates to trigger workers only.

I assume that a service account will be needed in the pods to use this feature, right? Or does Nuclio already inject a service account token into every pod?

You will need to create an SA and provide it; you can pass it via the UI or the function config. The pod will then use it from the function config.

Sure enough, I could use a shared volume mount, but I wanted to keep it as simple and stateless as possible.

Keep in mind that you will need to provide a locking abstraction on top of reading/writing that file.
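Such a locking abstraction could look roughly like this, using an advisory POSIX lock (fcntl.flock) on a JSON state file in the shared volume. The file path and state layout are illustrative assumptions, and advisory locks only protect against other cooperating processes that also take the lock.

```python
import fcntl
import json

# Sketch of a locking abstraction over a shared state file: every
# read-modify-write happens under an exclusive advisory lock so that
# concurrent replicas mounting the same volume do not clobber each other.

class LockedStateFile:
    def __init__(self, path: str):
        self.path = path

    def update(self, mutate):
        """Apply `mutate` (dict -> dict) to the JSON state atomically
        with respect to other LockedStateFile users, and return the
        new state."""
        # Create the file if it does not exist yet.
        with open(self.path, "a"):
            pass
        with open(self.path, "r+") as f:
            fcntl.flock(f, fcntl.LOCK_EX)
            try:
                raw = f.read()
                state = json.loads(raw) if raw else {}
                state = mutate(state)
                f.seek(0)
                f.truncate()
                json.dump(state, f)
                return state
            finally:
                fcntl.flock(f, fcntl.LOCK_UN)
```

A replica could use `update` to register its holder id on startup and remove it on shutdown; error recovery (a replica that dies while holding stale state) is exactly the pain point mentioned earlier in the thread.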

@dberardo-com
Author

Keep in mind that you will need to provide a locking abstraction on top of reading/writing that file.

Yeah, that might be hard to solve without a "serial number", maybe.

It has nothing to do with replicas; it relates to trigger workers only.

Mmm, I see... so there is no way to extract a sequence number from the context, then. And even if that were possible, the number would most likely not be updated when new replicas are spawned by HPA intervention (btw, any news on the autoscaling topic? - #2767).

I can have a more in-depth look at the context object, but from your answer I understand that there is currently no "internal" way to achieve this kind of deduplication / load balancing without the help of an external lock, and also that there is no interest in implementing anything like that in Nuclio, because it is considered an anti-pattern.
