New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-48155][SQL] AQEPropagateEmptyRelation for join should check if remain child is just BroadcastQueryStageExec #46523
Conversation
… remain child is just BroadcastQueryStageExec
ping @cloud-fan @maryannxue Pls take a look |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think you can provide test cases for this, @AngersZhuuuu ?
|
Added a new UT to show the difference, pls take a look again @dongjoon-hyun |
sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/AQEPropagateEmptyRelation.scala
Show resolved
Hide resolved
...catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/PropagateEmptyRelation.scala
Outdated
Show resolved
Hide resolved
sql/core/src/test/scala/org/apache/spark/sql/execution/adaptive/AdaptiveQueryExecSuite.scala
Show resolved
Hide resolved
// Project | ||
// +- LogicalQueryStage(_, BroadcastQueryStage) | ||
// Then after LogicalQueryStageStrategy, will only remain BroadcastQueryStage after project, | ||
// the plan can't execute. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can simply say
// A broadcast query stage can't be executed without the join operator.
// TODO: we can return the original query plan before broadcast.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how hard is it to return the original query plan? Seems not hard as we just need to add a new def returnSingleJoinSide
function in the base class, and unwrap broadcast stage in the AQE rule.
thanks, merging to master! |
What changes were proposed in this pull request?
It's a new approach to fix SPARK-39551
This situation happened for AQEPropagateEmptyRelation when one side is empty and one side is BroadcastQueryStateExec
This pr avoid do propagate, not to revert all queryStagePreparationRules's result.
Why are the changes needed?
Fix bug
Does this PR introduce any user-facing change?
No
How was this patch tested?
Manuel tested
SPARK-39551: Invalid plan check - invalid broadcast query stage
, it can work well without origin fix and current prFor added UT,
before this pr the adaptive plan is
After this patch
Was this patch authored or co-authored using generative AI tooling?
No