With the current Tekton backend workflow implementation, all steps in a workflow must execute on the same Kubernetes node. Tekton enforces this limitation because every step needs to mount the PVC associated with the pipeline workspace where the codeset is cloned.
This can have serious performance consequences, and can even cause pod scheduling failures that fail the pipeline, e.g. when a workflow step requires resources (GPUs/CPUs) that are not available on the node where the pipeline run is scheduled.
Luckily, all steps in a workflow are executed in sequence (there's no support for parallel workflow steps yet), so at least this doesn't reduce the degree of parallelism of FuseML workflows.
A different strategy should be investigated, for example one that doesn't automatically mount the workspace PVC for steps that don't need it. Alternatively, distributed storage should be used if available.
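To make the first idea more concrete, here is a minimal sketch of what "only mount the workspace when needed" could look like when generating the pipeline tasks. It is only an illustration, not the FuseML implementation: the `needs_codeset` flag, the `source` workspace name and the `build_pipeline_task` helper are all hypothetical. The generated dictionaries only mirror the `name`, `taskRef` and `workspaces` fields of a Tekton pipeline task.

```python
from typing import Any

# Hypothetical name of the pipeline workspace holding the cloned codeset.
WORKSPACE_NAME = "source"


def build_pipeline_task(step: dict[str, Any]) -> dict[str, Any]:
    """Build a Tekton-style pipeline task entry for one workflow step.

    The workspace binding is added only when the step declares (through the
    hypothetical `needs_codeset` flag) that it reads or writes the codeset.
    """
    task: dict[str, Any] = {
        "name": step["name"],
        "taskRef": {"name": step["name"]},
    }
    if step.get("needs_codeset", False):
        # Only these steps would mount the workspace PVC.
        task["workspaces"] = [
            {"name": WORKSPACE_NAME, "workspace": WORKSPACE_NAME}
        ]
    return task


# Example workflow: only the first step touches the codeset.
steps = [
    {"name": "clone", "needs_codeset": True},
    {"name": "train", "needs_codeset": False},
]
pipeline_tasks = [build_pipeline_task(s) for s in steps]
```

With something like this, steps that leave the flag unset would not reference the workspace PVC at all. Whether that is enough to let them be scheduled on a different node would still need to be verified against Tekton's scheduling behavior.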
stefannica changed the title from "All workflow step need to execute on the same kubernetes node" to "All workflow steps need to execute on the same kubernetes node" on Nov 15, 2021