fix: add assert to HashJoinExec::swap_inputs#23078
Conversation
| } | ||
|
|
||
| #[test] | ||
| fn test_swap_inputs_clears_dynamic_filter() -> Result<()> { |
There was a problem hiding this comment.
Could we add an end-to-end test that explains why this behavior is desired — that swap_inputs() should drop the dynamic filter? It’s a bit hard to infer the test goal from this unit test alone.
It should be some SQL queries that used to have bug in results, but enforcing this fix make them correct.
There was a problem hiding this comment.
i try it, but not easy to reproduce with the sql. this bug is trigger when ser/de the physical plan fc93043 (you can check this test case for more detail). i will give more try to reproduce this by sql.
There was a problem hiding this comment.
I'm wondering: are you using swap_inputs() directly downstream? 🤔
I think the root cause is that swap_inputs() has not been clearly specified. It implicitly assumes a bunch of preconditions, so it is only mostly safe to use internally when we strictly follow the optimizer rule order.
However, it is now a public API, so downstream usages may trigger subtle bugs if users are unaware of those preconditions.
See the optimizer rule order below. Internally, swap_inputs() is only called inside that optimizer rule, so we can ensure it is called only when dynamic_filter is None:
I tried adding assert!(self.dynamic_filter.is_none()) at the beginning of HashJoinExec::swap_inputs(), and the internal tests still pass.
So I suggest we discuss separately what extra contract you need for downstream usage, and then extend swap_inputs() accordingly. (and also add the above assertions to prevent similar issues)
There was a problem hiding this comment.
yes, we directly call the swap_inputs() after the dynamic filter pushdown rule(by add a new physical optmizer rule via SessionStateBuilder::with_physical_optimizer_rule).
There was a problem hiding this comment.
I opened a separate issue to discuss the broader public API contract of HashJoinExec::swap_inputs() for downstream optimizer usage: #23105
For this PR, should we keep it scoped to making swap_inputs() drop the stale dynamic filter and documenting that behavior, or would you prefer this PR only add an assertion/precondition for now?
There was a problem hiding this comment.
I think for now we should only assert that the dynamic filter is None when calling swap_inputs().
If your use case can follow the existing restriction, it should be both simpler and safer: it is very natural to first decide the join order and then construct dynamic filters, rather than build some dynamic filters -> swap the join order -> build the remaining dynamic filters.
This likely means ensuring extension physical optimizer rules follow the same restriction: dynamic filter construction rules should be placed after rules that swap join input orders.
If there are cases that cannot be solved this way, we can discuss them separately.
There was a problem hiding this comment.
apply suggestion in 1d85fc7
follow the existing restriction will solve the issue, but i must copy the below code and insert my optimizer rule into it😂.
datafusion/datafusion/physical-optimizer/src/optimizer.rs
Lines 144 to 250 in 46b508e
There was a problem hiding this comment.
Yes I agree, adding custom rule is quite tricky.
There are lots of implicit assumptions like 'rule A must be run after rule B', so if your extension rule might require certain properties, it's a good idea to add more tests and assertion guards to DataFusion core.
There was a problem hiding this comment.
There are lots of implicit assumptions like 'rule A must be run after rule B', so if your extension rule might require certain properties, it's a good idea to add more tests and assertion guards to DataFusion core.
Good idea
HashJoinExec::swap_inputs
2010YOUY01
left a comment
There was a problem hiding this comment.
Thanks! Should be good to go after CI passes.
|
Thanks for your review @2010YOUY01 @Dandandan |
Which issue does this PR close?
HashJoinExec::swap_inputspreserves old dynamic filter after swap #23077Rationale for this change
Dynamic filters are runtime state tied to the probe side.
swap_inputschanges the probe side, so preserving the old filter is unsafe and can lead to wrong column references.What changes are included in this PR?
remove the
dynamic_filterwhile swap inputs inHashJoinExec:Are these changes tested?
yes, add one test case
Are there any user-facing changes?
no