[BUG] Exception thrown when waiting for execution to finish #5349

Open
2 tasks done
ggydush opened this issue May 10, 2024 · 1 comment
Labels: bug (Something isn't working), flytekit (FlyteKit Python related issue), flyteremote


ggydush commented May 10, 2024

Describe the bug

The following exception is thrown intermittently when executing a workflow that contains a `@dynamic` workflow which spins up multiple tasks:

           elif e.code() == grpc.StatusCode.NOT_FOUND:
>               raise FlyteEntityNotExistException() from e
E               flytekit.exceptions.user.FlyteEntityNotExistException: USER:EntityNotExist: error=None, cause=<_InactiveRpcError of RPC that terminated with:
E                       status = StatusCode.NOT_FOUND
E                       details = "failed to fetch object: rpc error: code = NotFound desc = request failed with status code 404. Body: {"code":5,"message":"object 's3://union-compute-us-east-1-delvebio/metadata/propeller/clinical-pipeline-development-f76e3bd80dbf548ae8dc/n1/data/0/n3/0/outputs.pb' not found","details":[]}"
E                       debug_error_string = "UNKNOWN:Error received from peer  {created_time:"2024-05-10T12:35:04.72984-07:00", grpc_status:5, grpc_message:"failed to fetch object: rpc error: code = NotFound desc = request failed with status code 404. Body: {\"code\":5,\"message\":\"object \'s3://union-compute-us-east-1-delvebio/metadata/propeller/clinical-pipeline-development-f76e3bd80dbf548ae8dc/n1/data/0/n3/0/outputs.pb\' not found\",\"details\":[]}"}"
E               >

Expected behavior

`remote.wait(execution)` completes without raising an exception.

Additional context to reproduce

execution = remote.execute(
    name,
    inputs=inputs,
    version=version or None,
    **kwargs,
)
remote.wait(execution)

Screenshots

No response

Are you sure this issue hasn't been raised already?

  • Yes

Have you read the Code of Conduct?

  • Yes
@ggydush ggydush added bug Something isn't working untriaged This issues has not yet been looked at by the Maintainers labels May 10, 2024

runllm bot commented May 10, 2024

The error you're encountering, indicated by FlyteEntityNotExistException, suggests that the system is unable to locate a specific object, likely due to the dynamic task's outputs not being found at the expected location in S3. This could be a result of the object not being created or being deleted before retrieval. This issue is similar to problems discussed in Flyte GitHub issue #3639 where dynamic tasks sometimes lead to missing entities, causing failures when the system expects those entities to exist.
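If the missing `outputs.pb` is only temporarily absent (for example, the wait call reads it before it has been written), one possible stopgap is to retry the wait a few times before giving up. The following is a minimal sketch under that assumption, not a fix for the underlying bug and not part of flytekit. The helper name `wait_with_retry` and its parameters are hypothetical. It is written against plain callables so it runs without a Flyte backend.

```python
import time


def wait_with_retry(wait_fn, retry_on, attempts=5, delay_seconds=2.0, sleep=time.sleep):
    """Call wait_fn(), retrying when it raises one of the retry_on exceptions.

    Hypothetical helper, not a flytekit API.
    wait_fn:  zero-argument callable, e.g. lambda: remote.wait(execution)
    retry_on: exception class (or tuple of classes) treated as transient
    sleep:    injectable for testing; defaults to time.sleep
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return wait_fn()
        except retry_on:
            if attempt == attempts:
                raise  # out of retries: surface the original error
            sleep(delay_seconds * attempt)  # linear backoff between attempts
```

With the snippet from the report, this could be called as `wait_with_retry(lambda: remote.wait(execution), FlyteEntityNotExistException)`, importing the exception from `flytekit.exceptions.user` as shown in the traceback. If the object is never written, the last attempt still raises, so a genuinely missing output is not hidden.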

Would you like more detailed information on potential causes and solutions for this issue?


This is an AI-generated response and your feedback is appreciated! Please leave a 👍 if this is helpful and 👎 if it is not.


@eapolinario eapolinario self-assigned this May 16, 2024
@eapolinario eapolinario added flytekit FlyteKit Python related issue flyteremote and removed untriaged This issues has not yet been looked at by the Maintainers labels May 16, 2024