Do not use the same runner when the script failed #1514

Merged 2 commits on Apr 30, 2024

Commits on Apr 30, 2024

  1. Add a helper to create spare runner

    Our runners are job agnostic, meaning they can run any job with a
    matching label. This enables us to create a spare runner with the same
    label if the initial one doesn't function properly.
    
    This helper is also useful for on-call engineers.
    enescakir committed Apr 30, 2024
    aefdea6
  2. Do not use the same runner when the script failed

    Currently, when the script fails, we assume it failed because of an
    initialization error, and we try to register the same runner again.
    This assumption does not always hold. The script might also be marked
    "failed" while running the workflow. We should create a new spare
    runner and destroy the failed one.
    
    I could implement additional checks, such as verifying whether the
    runner has completed the job, and avoid creating a spare runner if the
    job is completed. However, runner script failures are uncommon. Even
    when they occur, they exit with a zero exit code, not a non-zero one.
    Therefore, I believe it's currently unnecessary to add more checks. If
    script failures increase, we can reconsider adding them.
    enescakir committed Apr 30, 2024
    2b5d118
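
The behavior described in the two commit messages can be illustrated with a minimal sketch. It is not the code from this pull request: `Runner`, `RunnerPool`, `provision_spare`, and `handle_script_failure` are hypothetical names, and the in-memory pool stands in for whatever actually provisions and destroys runners. It only shows the idea that, because runners are job agnostic, a failed runner can be replaced by a fresh one with the same label instead of being registered again.

```python
from dataclasses import dataclass, field
from itertools import count

# Hypothetical ID source; a real system would use its own identifiers.
_ids = count(1)


@dataclass
class Runner:
    label: str
    id: int = field(default_factory=lambda: next(_ids))
    destroyed: bool = False


class RunnerPool:
    """In-memory stand-in for a runner provisioning service (assumption)."""

    def __init__(self):
        self.runners = {}

    def provision(self, label):
        runner = Runner(label=label)
        self.runners[runner.id] = runner
        return runner

    def provision_spare(self, failed):
        # Runners are job agnostic, so matching the label is enough for
        # the spare to pick up any job the failed runner could have run.
        return self.provision(failed.label)

    def destroy(self, runner):
        runner.destroyed = True
        self.runners.pop(runner.id, None)

    def handle_script_failure(self, failed):
        # Do not re-register the failed runner: the script may have failed
        # mid-workflow rather than during initialization.
        spare = self.provision_spare(failed)
        self.destroy(failed)
        return spare
```

The sketch deliberately skips the extra check mentioned in the second commit message (whether the failed runner had already completed its job), matching the stated decision to leave that out for now.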