Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Kernel][Feature Request] Revisit how to treat unsupported expressions on partition columns that are part of the query filter #3017

Open
1 of 5 tasks
allisonport-db opened this issue May 2, 2024 · 0 comments
Labels
enhancement New feature or request kernel

Comments

@allisonport-db
Copy link
Collaborator

allisonport-db commented May 2, 2024

Feature request

Which Delta project/connector is this regarding?

  • Spark
  • Standalone
  • Flink
  • Kernel
  • Other (fill in here)

Overview

Currently if an unsupported expression involving a partition column is provided as a query filter the query will fail.

For example, if I have a table with partition column "part_1" and data column "data_1":

  • If I provide a query filter over the partition column such as "part_1 % 10 = 0" the entire query will fail with an exception.
  • If I provide a query filter over the data column such as "data_1 % 10 = 0" the query will succeed with less file-pruning, and the query filter will be returned to me as part of the "remaining filter".
    • This is because data skipping only generates filters for supported expressions and returns the rest back as a remaining filter.

We should consider what behavior we expect for unsupported partition filters. Options may include

  • Query succeeds and any unsupported partition filters are returned as part of the remaining filter.
  • Query fails; it is the connector's responsibility to only build Kernel expressions that are supported by the engine expression handler.
    • This is one of the main questions here; who's responsibility is it to ensure expressions are supported? And if it is the connector's, how will they be able to determine this?

Further details

Some work has already been done for this w/o consensus on what behavior we want:

@allisonport-db allisonport-db added enhancement New feature or request kernel labels May 2, 2024
allisonport-db added a commit that referenced this issue May 3, 2024
…3018)

<!--
Thanks for sending a pull request!  Here are some tips for you:
1. If this is your first time, please read our contributor guidelines:
https://github.com/delta-io/delta/blob/master/CONTRIBUTING.md
2. If the PR is unfinished, add '[WIP]' in your PR title, e.g., '[WIP]
Your PR title ...'.
  3. Be sure to keep the PR description updated to reflect all changes.
  4. Please write your PR title to summarize what this PR proposes.
5. If possible, provide a concise example to reproduce the issue for a
faster review.
6. If applicable, include the corresponding issue number in the PR title
and link it in the body.
-->

#### Which Delta project/connector is this regarding?
<!--
Please add the component selected below to the beginning of the pull
request title
For example: [Spark] Title of my pull request
-->

- [ ] Spark
- [ ] Standalone
- [ ] Flink
- [X] Kernel
- [ ] Other (fill in here)

## Description

Removes a currently unused API in `ExpressionHandler`.

See #3017 for details. **TLDR**
we added this API before achieving consensus on what we want to
implement and it is currently unused. Remove it for now and we can add
it back later if needed.

## How was this patch tested?

N/A

## Does this PR introduce _any_ user-facing changes?

No
allisonport-db added a commit to allisonport-db/delta that referenced this issue May 4, 2024
…elta-io#3018)

<!--
Thanks for sending a pull request!  Here are some tips for you:
1. If this is your first time, please read our contributor guidelines:
https://github.com/delta-io/delta/blob/master/CONTRIBUTING.md
2. If the PR is unfinished, add '[WIP]' in your PR title, e.g., '[WIP]
Your PR title ...'.
  3. Be sure to keep the PR description updated to reflect all changes.
  4. Please write your PR title to summarize what this PR proposes.
5. If possible, provide a concise example to reproduce the issue for a
faster review.
6. If applicable, include the corresponding issue number in the PR title
and link it in the body.
-->

#### Which Delta project/connector is this regarding?
<!--
Please add the component selected below to the beginning of the pull
request title
For example: [Spark] Title of my pull request
-->

- [ ] Spark
- [ ] Standalone
- [ ] Flink
- [X] Kernel
- [ ] Other (fill in here)

## Description

Removes a currently unused API in `ExpressionHandler`.

See delta-io#3017 for details. **TLDR**
we added this API before achieving consensus on what we want to
implement and it is currently unused. Remove it for now and we can add
it back later if needed.

## How was this patch tested?

N/A

## Does this PR introduce _any_ user-facing changes?

No

(cherry picked from commit e7cafec)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request kernel
Projects
None yet
Development

No branches or pull requests

1 participant