Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-48213][SQL] Do not push down predicate if non-cheap expression exceed reused limit #46499

Closed
wants to merge 1 commit into from

Conversation

zml1206
Copy link
Contributor

@zml1206 zml1206 commented May 9, 2024

What changes were proposed in this pull request?

Avoid push down predicate if non-cheap expression exceed reused limit. Push down predicate through project/aggregate need replace expression, if the expression is non-cheap and reused many times, the cost of repeated calculations may be greater than the benefits of pushdown predicates.

Why are the changes needed?

Like #33958, to avoid performance regression caused by repeated evaluation of expensive expressions and larger plans such as case when nested, the difference is that push down will have additional benefits, so add limit of reused count conf instead of 1.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Unit test.

Was this patch authored or co-authored using generative AI tooling?

No.

@github-actions github-actions bot added the SQL label May 9, 2024
@zml1206
Copy link
Contributor Author

zml1206 commented May 16, 2024

cc @cloud-fan

@cloud-fan
Copy link
Contributor

I think #45802 (comment) is a better idea.

@zml1206
Copy link
Contributor Author

zml1206 commented May 17, 2024

with is a good idea, thank you very much @cloud-fan . Close it.

@zml1206 zml1206 closed this May 17, 2024
@zml1206
Copy link
Contributor Author

zml1206 commented May 22, 2024

@cloud-fan I tried to push down predicate through With, however, found that when there is complex nesting, legal With cannot be generated, common expression exists in both child and defs of With.
For example

SELECT *
FROM (
	SELECT c, c * c AS d
	FROM (
		SELECT a + a AS c
		FROM t1
	) t2
	GROUP BY c
) t3
WHERE d + d > c + c

@cloud-fan
Copy link
Contributor

Can we make With nested as well?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
2 participants