The task is to find a better way to handle this problem. Two options:
1. If the predict request fails, return 0 replicas.
2. Drop the cluster from the available list when its predictor is not healthy.
With method 1, a cluster whose replicas is 0 still stays in the binding clusters and cannot be removed. Either it needs to be removed during the merge process, or there might be another way to address this.
With method 2, the cluster is dropped from the available cluster list as soon as the predict request for one feed fails, even if the subscription has many feeds. That is a radical approach when there are only a few child clusters.
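To make the trade-off concrete, here is a minimal sketch of the two methods. It is not the clusternet implementation: `predictFunc`, `samplePredict`, `methodOne`, and `methodTwo` are hypothetical names, and the real predictor call goes over HTTP rather than through a plain function.

```go
package main

import (
	"errors"
	"fmt"
)

// predictFunc is a hypothetical stand-in for the predict request
// sent to one child cluster.
type predictFunc func(cluster string) (int32, error)

// samplePredict simulates a predictor that is unreachable for "cluster-b".
func samplePredict(cluster string) (int32, error) {
	if cluster == "cluster-b" {
		return 0, errors.New("predictor unreachable")
	}
	return 5, nil
}

// methodOne keeps every cluster and records 0 replicas
// when its predict request fails.
func methodOne(clusters []string, predict predictFunc) map[string]int32 {
	out := make(map[string]int32, len(clusters))
	for _, c := range clusters {
		replicas, err := predict(c)
		if err != nil {
			replicas = 0
		}
		out[c] = replicas
	}
	return out
}

// methodTwo drops a cluster from the available list
// when its predict request fails.
func methodTwo(clusters []string, predict predictFunc) []string {
	var available []string
	for _, c := range clusters {
		if _, err := predict(c); err == nil {
			available = append(available, c)
		}
	}
	return available
}

func main() {
	clusters := []string{"cluster-a", "cluster-b"}
	fmt.Println(methodOne(clusters, samplePredict)) // cluster-b stays, with 0 replicas
	fmt.Println(methodTwo(clusters, samplePredict)) // cluster-b is gone
}
```

In this sketch, method 1 leaves `cluster-b` in the result with 0 replicas, which is why it would still appear among the binding clusters, while method 2 removes it outright.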
I'd prefer method 1, returning 0 replicas, which is friendly to the current scheduling framework and its implementations.
By adding a new flag to the struct ClusterScore to indicate such unhealthy-predictor cases, all clusters with 0 replicas could be easily pruned in the function RunPredictPlugins.
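A rough sketch of that idea follows. The `ClusterScore` type here is a simplified stand-in defined only for illustration, not the real struct from the scheduler framework; the field names, the `PredictorUnhealthy` flag, and the `pruneUnhealthy` helper are all assumptions.

```go
package main

import "fmt"

// ClusterScore is a simplified, hypothetical stand-in for the
// scheduler framework's struct; the real fields may differ.
type ClusterScore struct {
	NamespacedName       string
	MaxAvailableReplicas int32
	// PredictorUnhealthy is the proposed new flag (hypothetical name):
	// set when the predict request for this cluster failed.
	PredictorUnhealthy bool
}

// pruneUnhealthy returns only the clusters whose predictor was not
// flagged as unhealthy. A step like this could run at the end of
// RunPredictPlugins.
func pruneUnhealthy(scores []ClusterScore) []ClusterScore {
	kept := make([]ClusterScore, 0, len(scores))
	for _, s := range scores {
		if s.PredictorUnhealthy {
			continue
		}
		kept = append(kept, s)
	}
	return kept
}

func main() {
	scores := []ClusterScore{
		{NamespacedName: "ns/cluster-a", MaxAvailableReplicas: 3},
		{NamespacedName: "ns/cluster-b", MaxAvailableReplicas: 0, PredictorUnhealthy: true},
		{NamespacedName: "ns/cluster-c", MaxAvailableReplicas: 0},
	}
	for _, s := range pruneUnhealthy(scores) {
		fmt.Println(s.NamespacedName, s.MaxAvailableReplicas)
	}
}
```

One design point of using a flag instead of checking for 0 replicas: in this sketch a healthy cluster that simply has no capacity (`ns/cluster-c`) is kept, and only the cluster whose predictor failed is pruned.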
add post-predict extension point to process predictor unhealthy cluster
What would you like to be added:
If the predict HTTP request fails, an error is returned and scheduling is cancelled, like this:
https://github.com/clusternet/clusternet/blob/main/pkg/scheduler/framework/plugins/predictor/predictor.go#L128
A single cluster's predictor failure thus causes scheduling of the whole subscription to fail, which is inappropriate.