
Remote logging doesn't work #835

Open
jkleinkauff opened this issue Mar 6, 2024 · 2 comments
Labels
kind/bug kind - things not working properly

Comments


jkleinkauff commented Mar 6, 2024

Chart Version

8.8.0

Kubernetes Version

Client Version: v1.29.0-eks-5e0fdde
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.28.5-eks-5e0fdde

Helm Version

version.BuildInfo{Version:"v3.12.3", GitCommit:"3a31588ad33fe3b89af5a2a54ee1d25bfe6eaa5e", GitTreeState:"clean", GoVersion:"go1.20.7"}

Description

Hey folks!
I'm bumping our Airflow version from 2.2.5 to 2.7.3, and I also upgraded the chart version to 8.8.0.

During this upgrade, it seems remote logging is not working anymore. I'm unsure in which version it stopped, though.

My config:

airflow:
  config:
    AIRFLOW__LOGGING__REMOTE_LOGGING: "True"
    AIRFLOW__LOGGING__REMOTE_LOG_CONN_ID: "aws"
    AIRFLOW__LOGGING__ENCRYPT_S3_LOGS: "True"

  connections:
    - id: aws
      type: aws

in helm_release:

  set {
    name  = "airflow.config.AIRFLOW__LOGGING__REMOTE_BASE_LOG_FOLDER"
    value = "s3://${var.s3_bucket_logs}"
  }

  set {
    name  = "airflow.connections[0].login"
    value = aws_id
  }

  set_sensitive {
    name  = "airflow.connections[0].password"
    value = aws_secret
  }

policy:

    actions = [
      "s3:ListBucket",
      "s3:List*",
      "s3:Get*",
      "s3:PutObject*",
      "s3:PutBucketPublicAccessBlock",
      "s3:GetBucketAcl",
      "s3:GetBucketLocation",
      "s3:DeleteObject",
    ]

    resources = [
      "arn:aws:s3:::loadsmart-airflow-logs-${var.env}/*",
]

I know the logs point at permissions, but there must be more to it, since these are the same settings that were working before. I would love any tips, thank you friends!
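One thing worth double-checking on the policy side, though: my understanding is that s3:ListBucket (which backs the failing ListObjectsV2 call in the traceback below) is evaluated against the bucket ARN itself, not the "/*" object ARN, so the resources list may need both entries. A sketch, assuming the same bucket:

    resources = [
      # bucket ARN: needed for s3:ListBucket (ListObjectsV2)
      "arn:aws:s3:::loadsmart-airflow-logs-${var.env}",

      # object ARNs: needed for s3:GetObject / s3:PutObject / s3:DeleteObject
      "arn:aws:s3:::loadsmart-airflow-logs-${var.env}/*",
    ]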

Relevant Logs

kubectl logs airflow-staging-web-7bd6d8c75b-c9bst -n airflow:

[2024-03-06T00:17:13.350+0000] {base.py:73} INFO - Using connection ID 'aws' for task execution.
[2024-03-06T00:17:13.351+0000] {connection_wrapper.py:378} INFO - AWS Connection (conn_id='aws', conn_type='aws') credentials retrieved from login and password.
[2024-03-06T00:17:14.267+0000] {app.py:1744} ERROR - Exception on /api/v1/dags/airflow_db_cleanup_dag/dagRuns/manual__2024-03-06T00:16:59.449220+00:00/taskInstances/print_configuration/logs/1 [GET]
Traceback (most recent call last):
  File "/home/airflow/.local/lib/python3.9/site-packages/flask/app.py", line 2529, in wsgi_app
    response = self.full_dispatch_request()
  File "/home/airflow/.local/lib/python3.9/site-packages/flask/app.py", line 1825, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/home/airflow/.local/lib/python3.9/site-packages/flask/app.py", line 1823, in full_dispatch_request
    rv = self.dispatch_request()
  File "/home/airflow/.local/lib/python3.9/site-packages/flask/app.py", line 1799, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)
  File "/home/airflow/.local/lib/python3.9/site-packages/connexion/decorators/decorator.py", line 68, in wrapper
    response = function(request)
  File "/home/airflow/.local/lib/python3.9/site-packages/connexion/decorators/uri_parsing.py", line 149, in wrapper
    response = function(request)
  File "/home/airflow/.local/lib/python3.9/site-packages/connexion/decorators/validation.py", line 399, in wrapper
    return function(request)
  File "/home/airflow/.local/lib/python3.9/site-packages/connexion/decorators/response.py", line 113, in wrapper
    return _wrapper(request, response)
  File "/home/airflow/.local/lib/python3.9/site-packages/connexion/decorators/response.py", line 90, in _wrapper
    self.operation.api.get_connexion_response(response, self.mimetype)
  File "/home/airflow/.local/lib/python3.9/site-packages/connexion/apis/abstract.py", line 366, in get_connexion_response
    return cls._framework_to_connexion_response(response=response, mimetype=mimetype)
  File "/home/airflow/.local/lib/python3.9/site-packages/connexion/apis/flask_api.py", line 165, in _framework_to_connexion_response
    body=response.get_data() if not response.direct_passthrough else None,
  File "/home/airflow/.local/lib/python3.9/site-packages/werkzeug/wrappers/response.py", line 314, in get_data
    self._ensure_sequence()
  File "/home/airflow/.local/lib/python3.9/site-packages/werkzeug/wrappers/response.py", line 376, in _ensure_sequence
    self.make_sequence()
  File "/home/airflow/.local/lib/python3.9/site-packages/werkzeug/wrappers/response.py", line 391, in make_sequence
    self.response = list(self.iter_encoded())
  File "/home/airflow/.local/lib/python3.9/site-packages/werkzeug/wrappers/response.py", line 50, in _iter_encoded
    for item in iterable:
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/utils/log/log_reader.py", line 87, in read_log_stream
    logs, metadata = self.read_log_chunks(ti, current_try_number, metadata)
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/utils/log/log_reader.py", line 64, in read_log_chunks
    logs, metadatas = self.log_handler.read(ti, try_number, metadata=metadata)
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/utils/log/file_task_handler.py", line 413, in read
    log, out_metadata = self._read(task_instance, try_number_element, metadata)
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/providers/amazon/aws/log/s3_task_handler.py", line 149, in _read
    return super()._read(ti, try_number, metadata)
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/utils/log/file_task_handler.py", line 313, in _read
    remote_messages, remote_logs = self._read_remote_logs(ti, try_number, metadata)
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/providers/amazon/aws/log/s3_task_handler.py", line 123, in _read_remote_logs
    keys = self.hook.list_keys(bucket_name=bucket, prefix=prefix)
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/providers/amazon/aws/hooks/s3.py", line 89, in wrapper
    return func(*bound_args.args, **bound_args.kwargs)
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/providers/amazon/aws/hooks/s3.py", line 820, in list_keys
    for page in response:
  File "/home/airflow/.local/lib/python3.9/site-packages/botocore/paginate.py", line 269, in __iter__
    response = self._make_request(current_kwargs)
  File "/home/airflow/.local/lib/python3.9/site-packages/botocore/paginate.py", line 357, in _make_request
    return self._method(**current_kwargs)
  File "/home/airflow/.local/lib/python3.9/site-packages/botocore/client.py", line 535, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/home/airflow/.local/lib/python3.9/site-packages/botocore/client.py", line 980, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (AccessDenied) when calling the ListObjectsV2 operation: Access Denied

Custom Helm Values

airflow:
  config:
    AIRFLOW__LOGGING__REMOTE_LOGGING: "True"
    AIRFLOW__LOGGING__REMOTE_LOG_CONN_ID: "aws"
    AIRFLOW__LOGGING__ENCRYPT_S3_LOGS: "True"

  connections:
    - id: aws
      type: aws
@jkleinkauff jkleinkauff added the kind/bug kind - things not working properly label Mar 6, 2024
@tusharraichand
I have a similar issue, but with GCP: my logs are being pushed to the GCS bucket, but the Airflow UI cannot read them. I see the following error in the Airflow UI:
*** No logs found in GCS; ti=%s <TaskInstance: xyz> *** Could not read served logs: [Errno -2] Name or service not known
Also, I am new to Helm; what's the correct way of passing these values in values.yaml?

airflow:
  config:
    AIRFLOW__LOGGING__REMOTE_LOGGING: "True"
    AIRFLOW__LOGGING__REMOTE_BASE_LOG_FOLDER: "gs://airflow/logs"
    AIRFLOW__LOGGING__REMOTE_LOG_CONN_ID: "my_conn"

  connections:
    - id: my_conn
      type: google_cloud_platform
      description: my GCP connection

When I add these values to values.yaml and run a helm upgrade, I see no changes happening to any pod/deployment.
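For what it's worth, with plain Helm those keys are applied by passing the file on upgrade (e.g. helm upgrade <release> <chart> -f values.yaml); if nothing changes in the pods afterwards, it is worth confirming the file is actually being passed to the command. If the release is driven from Terraform instead, like the setup above, the same keys can be set individually. A sketch, assuming a similar helm_release:

  set {
    name  = "airflow.config.AIRFLOW__LOGGING__REMOTE_BASE_LOG_FOLDER"
    value = "gs://airflow/logs"
  }

  set {
    name  = "airflow.connections[0].id"
    value = "my_conn"
  }

  set {
    name  = "airflow.connections[0].type"
    value = "google_cloud_platform"
  }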

@b0kky
Copy link

b0kky commented Mar 18, 2024

+1
Found a workaround: pass the creds through as env vars.
#833 (comment)
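For reference, a minimal sketch of what that kind of workaround can look like on the Terraform setup above. This assumes the chart exposes an airflow.extraEnv list and that boto3 falls back to the standard AWS credential environment variables when no connection is configured; both are assumptions here, see the linked comment for the actual details:

  # hypothetical: inject AWS credentials as plain pod env vars
  # instead of (or in addition to) an Airflow connection
  set {
    name  = "airflow.extraEnv[0].name"
    value = "AWS_ACCESS_KEY_ID"
  }

  set {
    name  = "airflow.extraEnv[0].value"
    value = aws_id
  }

  set {
    name  = "airflow.extraEnv[1].name"
    value = "AWS_SECRET_ACCESS_KEY"
  }

  set_sensitive {
    name  = "airflow.extraEnv[1].value"
    value = aws_secret
  }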
