New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
make timeout for check-db configurable, and ensure logs are written when timeout happens #615
Comments
@thesuperzapper I can make a PR for this, but which approach would you prefer I take? (if you don't care I can just remove the timeout, which is my preference) |
We can pick the timeout value from values.yaml by using something like .Values.checkDb.timeout and add if else condition to use default timeout value? |
@asosnovsky-sumologic @anuragphadnis Thanks for raising this, I am not sure why I did not introduce a value to change the timeout length when I first introduced the timeout. However, I do remember that there are situations when While you are correct that the airflow Dockerfile has an in-built check-db, this is not suitable for the chart for 2 primary reasons:
My thought is that we can do two things:
Other thoughts:
|
In my case the check-db pod gets restarted after every 60 seconds and connection is not being able to establish in this time. This happens every time after check-db is restarted, and check-db does not log anything. It would be very useful if we can customize the value so that we can see logs or establish the connection. |
I agree with this line of thinking. Like do something along the lines of {{- if .Values.airflow.legacyCommands }}
- "exec timeout {{mul .Values.config.CONNECTION_CHECK_SLEEP_TIME .Values.config.CONNECTION_CHECK_MAX_COUNT}}s airflow checkdb"
{{- else }}
- "exec timeout {{mul .Values.config.CONNECTION_CHECK_SLEEP_TIME .Values.config.CONNECTION_CHECK_MAX_COUNT}}s airflow db check"
{{- end }} Or even provide these values under # value.yaml
airflow:
db_connection:
check_max_count: 10
check_sleep_time: 30 |
Yes @asosnovsky Right now I have modified it in a similar way to use value from config and using that code to deploy the chart. I can create a PR if someone else is not working on it. |
@anuragphadnis i don't mind taking this to completion. If your change is already merged, I can do the rest of the work and update the checkdb command :) |
@asosnovsky @anuragphadnis @asosnovsky-sumologic If anyone wants to pick up the tasks I suggested in #615 (comment), I am happy to review/merge the PR, the tasks are:
|
Checks
User-Community Airflow Helm Chart
.Motivation
In our company we are currently setting up a cluster for airflow, and while debugging some networking issues between our cluster and rds we seem to constantly hit the timeout settings in here https://github.com/airflow-helm/charts/blob/main/charts/airflow/templates/_helpers/pods.tpl#L72-L75 , which once hit hides all of the important logs that we would have received if the process failed on it's on.
The way we handle it right now is by manually editing the container's command and setting the timeout to something higher.
Implementation
Here are some different ideas:
airflow.check-db
that would allow us to set a custom timeoutCONNECTION_CHECK_SLEEP_TIME
andCONNECTION_CHECK_MAX_COUNT
variables (see https://airflow.apache.org/docs/docker-stack/entrypoint.html#waits-for-airflow-db-connection ) -- I prefer this solutionAre you willing & able to help?
The text was updated successfully, but these errors were encountered: