Skip to content

[fix](fe) Relax nullable slot check in outer join to anti join#65110

Open
morrySnow wants to merge 1 commit into
apache:masterfrom
morrySnow:outer-to-anti-opt
Open

[fix](fe) Relax nullable slot check in outer join to anti join#65110
morrySnow wants to merge 1 commit into
apache:masterfrom
morrySnow:outer-to-anti-opt

Conversation

@morrySnow

Copy link
Copy Markdown
Contributor

What problem does this PR solve?

Related PR: #23833

Problem Summary: ConvertOuterJoinToAntiJoin only allowed conversion when the null-supplying side slot tested by IS NULL was non-nullable. This missed valid cases where the slot is nullable but appears in a null-rejecting binary comparison between left and right join slots, because a NULL value cannot satisfy those join predicates. The fix records slots from non-null-safe comparison predicates between join children, matches child slots to the corresponding outer join output slots, and permits conversion for those null-rejected nullable slots while keeping <=> excluded.

Release note

None

Check List (For Author)

  • Test:
    • Unit Test: ./run-fe-ut.sh --run org.apache.doris.nereids.rules.rewrite.ConvertOuterJoinToAntiJoinTest
    • Regression test: ./run-regression-test.sh --run -d nereids_syntax_p0 -s transform_outer_join_to_anti
    • FE Checkstyle: mvn checkstyle:check -pl fe-core
  • Behavior changed: Yes. The optimizer can now convert eligible outer joins with nullable null-supplying slots to anti joins when the slot is null-rejected by a non-null-safe binary comparison join condition.
  • Does this need documentation: No

### What problem does this PR solve?

Issue Number: None

Related PR: None

Problem Summary: ConvertOuterJoinToAntiJoin only allowed conversion when the null-supplying side slot tested by IS NULL was non-nullable. This missed valid cases where the slot is nullable but appears in a null-rejecting binary comparison between left and right join slots, because a NULL value cannot satisfy those join predicates. The fix records slots from non-null-safe comparison predicates between join children, matches child slots to the corresponding outer join output slots, and permits conversion for those null-rejected nullable slots while keeping <=> excluded.

### Release note

None

### Check List (For Author)

- Test:
    - Unit Test: ./run-fe-ut.sh --run org.apache.doris.nereids.rules.rewrite.ConvertOuterJoinToAntiJoinTest
    - Regression test: ./run-regression-test.sh --run -d nereids_syntax_p0 -s transform_outer_join_to_anti
    - FE Checkstyle: mvn checkstyle:check -pl fe-core
- Behavior changed: Yes. The optimizer can now convert eligible outer joins with nullable null-supplying slots to anti joins when the slot is null-rejected by a non-null-safe binary comparison join condition.
- Does this need documentation: No
@morrySnow

Copy link
Copy Markdown
Contributor Author

/review

@morrySnow

Copy link
Copy Markdown
Contributor Author

run buildall

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR updates the FE Nereids rewrite rule ConvertOuterJoinToAntiJoin to allow a broader (but still safe) class of outer-join → anti-join conversions when the IS NULL-tested slot on the null-supplying side is nullable in schema but is null-rejected by a non-null-safe comparison join predicate (excluding <=>). This improves rewrite coverage without changing behavior in null-safe cases.

Changes:

  • Extend the rewrite’s eligibility check by collecting “null-rejected” join slots from non-null-safe ComparisonPredicates between left/right children and allowing conversion for those nullable slots.
  • Add FE unit tests covering nullable-slot conversions for = and <, and a non-conversion case for <=>.
  • Add regression EXPLAIN coverage for the new nullable-slot cases and a <=> non-conversion case.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

File Description
fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/ConvertOuterJoinToAntiJoin.java Relaxes the nullable-slot restriction by recognizing null-rejected slots from non-null-safe comparisons, while keeping <=> excluded.
fe/fe-core/src/test/java/org/apache/doris/nereids/rules/rewrite/ConvertOuterJoinToAntiJoinTest.java Adds unit tests validating conversion for nullable join slots under null-rejecting predicates and preventing conversion for null-safe equality.
regression-test/suites/nereids_syntax_p0/transform_outer_join_to_anti.groovy Adds EXPLAIN assertions ensuring the planner emits ANTI JOIN for eligible nullable-slot cases and retains OUTER JOIN for <=>.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@github-actions github-actions Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reviewed the full PR scope: ConvertOuterJoinToAntiJoin, its FE unit tests, and the regression explain suite. I did not find a blocking issue or a substantiated inline comment to leave.

Critical checkpoint conclusions:

  • Goal/test proof: The PR broadens outer-join-to-anti conversion for nullable schema slots only when the join predicate is a non-null-safe comparison between direct child slots, and adds FE unit plus regression explain coverage for =, <, right-join symmetry, and <=> non-conversion.
  • Scope/focus: The implementation is focused on the eligibility check and keeps the existing rewrite structure, projection/null-alias behavior, ASOF exclusion, and rule registration unchanged.
  • Semantic correctness: The new getNullRejectedJoinSlots path is conservative: it requires ComparisonPredicate, excludes NullSafeEqual, and only accepts direct slot comparisons across the two join children. This matches SQL null rejection for matched rows and leaves nullable non-join-key IS NULL cases unconverted.
  • Parallel paths: Left and right outer joins are both covered; hash and other join conjuncts are both considered. I did not find an aggregate, mark-join, or rule-stage parallel path requiring an additional change for this PR.
  • Compatibility/config/concurrency/persistence: No new config, persistence, protocol, storage format, or concurrency behavior is introduced.
  • Tests: The new tests cover the changed nullable-slot behavior and existing regression style is preserved. I could not run FE unit/regression tests in this checkout because thirdparty/installed and executable thirdparty/installed/bin/protoc are missing; per local FE instructions, build/test invocation must stop in that state. git diff --check passed for the changed files.

User focus points: No additional user-provided review focus was present.

Subagent conclusions: optimizer-rewrite and tests-session-config both found no valuable candidates. After the main pass recorded an empty proposed inline comment set, convergence round 1 ended with both live subagents replying NO_NEW_VALUABLE_FINDINGS for the same current ledger/comment set.

@hello-stephen

Copy link
Copy Markdown
Contributor
TPC-H: Total hot run time: 29879 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 08c2ecee7dd3f66d6b29b2604029ae67f3dc6d50, data reload: false

------ Round 1 ----------------------------------
============================================
q1	17577	4008	4005	4005
q2	2022	327	194	194
q3	10280	1523	834	834
q4	4680	494	342	342
q5	7514	866	568	568
q6	179	176	138	138
q7	787	849	630	630
q8	9542	1539	1678	1539
q9	5844	4418	4442	4418
q10	6766	1801	1551	1551
q11	515	346	330	330
q12	729	578	438	438
q13	18091	3354	2746	2746
q14	269	257	243	243
q15	q16	790	779	713	713
q17	1039	994	960	960
q18	7037	5768	5508	5508
q19	1329	1319	1075	1075
q20	757	662	613	613
q21	6300	2918	2715	2715
q22	474	374	319	319
Total cold run time: 102521 ms
Total hot run time: 29879 ms

----- Round 2, with runtime_filter_mode=off -----
============================================
q1	5155	4743	4761	4743
q2	300	345	216	216
q3	4999	5273	4796	4796
q4	2093	2162	1365	1365
q5	4998	4681	4702	4681
q6	259	192	133	133
q7	1835	1699	1531	1531
q8	2411	2103	2074	2074
q9	7656	7218	7207	7207
q10	4761	4669	4263	4263
q11	520	386	350	350
q12	747	757	530	530
q13	2972	3367	2723	2723
q14	279	275	253	253
q15	q16	672	695	612	612
q17	1282	1259	1247	1247
q18	7273	6772	6865	6772
q19	1141	1112	1150	1112
q20	2190	2219	1923	1923
q21	5292	4562	4405	4405
q22	523	462	396	396
Total cold run time: 57358 ms
Total hot run time: 51332 ms

@hello-stephen

Copy link
Copy Markdown
Contributor
TPC-DS: Total hot run time: 174562 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 08c2ecee7dd3f66d6b29b2604029ae67f3dc6d50, data reload: false

query5	4386	634	494	494
query6	467	225	201	201
query7	4912	586	344	344
query8	337	185	176	176
query9	8742	4004	4014	4004
query10	474	349	290	290
query11	5927	2336	2161	2161
query12	159	106	113	106
query13	1267	629	432	432
query14	6170	5291	4948	4948
query14_1	4280	4246	4329	4246
query15	209	195	177	177
query16	1045	506	408	408
query17	1099	721	566	566
query18	2449	455	336	336
query19	201	186	146	146
query20	114	110	105	105
query21	230	158	132	132
query22	13714	13575	13431	13431
query23	17334	16471	16143	16143
query23_1	16237	16246	16177	16177
query24	7700	1790	1310	1310
query24_1	1325	1337	1308	1308
query25	576	472	391	391
query26	1335	350	221	221
query27	2611	605	388	388
query28	4434	2029	2000	2000
query29	1105	639	514	514
query30	337	264	229	229
query31	1127	1102	984	984
query32	111	66	66	66
query33	546	326	273	273
query34	1190	1129	662	662
query35	765	795	693	693
query36	1420	1378	1282	1282
query37	161	112	97	97
query38	1880	1716	1675	1675
query39	920	933	899	899
query39_1	883	885	886	885
query40	258	167	147	147
query41	72	69	67	67
query42	98	94	95	94
query43	321	327	281	281
query44	1438	787	788	787
query45	207	201	183	183
query46	1067	1175	749	749
query47	2362	2330	2283	2283
query48	423	433	312	312
query49	609	456	332	332
query50	1034	434	325	325
query51	4410	4386	4412	4386
query52	88	87	78	78
query53	269	282	210	210
query54	304	247	247	247
query55	78	75	68	68
query56	316	309	291	291
query57	1441	1398	1321	1321
query58	285	273	269	269
query59	1570	1636	1395	1395
query60	322	285	262	262
query61	177	171	193	171
query62	690	649	587	587
query63	237	204	207	204
query64	2531	776	617	617
query65	4892	4810	4779	4779
query66	1832	508	399	399
query67	29750	29566	29541	29541
query68	3101	1573	1035	1035
query69	398	310	261	261
query70	1049	957	939	939
query71	392	314	299	299
query72	2941	2855	2382	2382
query73	821	770	465	465
query74	5100	4970	4745	4745
query75	2596	2589	2225	2225
query76	2326	1216	791	791
query77	349	379	282	282
query78	12272	12474	11850	11850
query79	1448	1190	776	776
query80	1292	539	460	460
query81	522	342	293	293
query82	576	156	122	122
query83	391	329	285	285
query84	286	164	131	131
query85	972	600	533	533
query86	419	295	278	278
query87	1830	1822	1771	1771
query88	3688	2819	2793	2793
query89	447	414	355	355
query90	1914	200	195	195
query91	199	186	166	166
query92	62	61	65	61
query93	1524	1565	1058	1058
query94	756	368	325	325
query95	836	493	478	478
query96	1022	805	391	391
query97	2681	2738	2572	2572
query98	217	209	199	199
query99	1140	1173	1042	1042
Total cold run time: 259008 ms
Total hot run time: 174562 ms

@hello-stephen

Copy link
Copy Markdown
Contributor
ClickBench: Total hot run time: 25.2 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 08c2ecee7dd3f66d6b29b2604029ae67f3dc6d50, data reload: false

query1	0.00	0.00	0.01
query2	0.10	0.06	0.05
query3	0.26	0.13	0.14
query4	1.61	0.14	0.14
query5	0.25	0.26	0.22
query6	1.23	1.07	1.05
query7	0.04	0.00	0.00
query8	0.05	0.04	0.04
query9	0.39	0.33	0.34
query10	0.55	0.58	0.54
query11	0.20	0.15	0.15
query12	0.18	0.14	0.14
query13	0.46	0.48	0.46
query14	1.03	1.01	1.03
query15	0.62	0.59	0.59
query16	0.32	0.32	0.33
query17	1.11	1.14	1.12
query18	0.23	0.21	0.21
query19	2.03	1.94	1.89
query20	0.01	0.01	0.01
query21	15.46	0.23	0.13
query22	4.86	0.05	0.05
query23	16.17	0.32	0.14
query24	2.89	0.44	0.32
query25	0.12	0.05	0.06
query26	0.72	0.21	0.15
query27	0.05	0.03	0.03
query28	3.54	0.93	0.52
query29	12.49	4.31	3.46
query30	0.27	0.15	0.17
query31	2.77	0.60	0.31
query32	3.22	0.61	0.49
query33	3.16	3.22	3.22
query34	15.57	4.26	3.51
query35	3.52	3.49	3.52
query36	0.55	0.43	0.42
query37	0.08	0.06	0.07
query38	0.05	0.04	0.04
query39	0.04	0.03	0.03
query40	0.17	0.16	0.15
query41	0.09	0.03	0.02
query42	0.04	0.03	0.03
query43	0.04	0.04	0.03
Total cold run time: 96.54 s
Total hot run time: 25.2 s

@hello-stephen

Copy link
Copy Markdown
Contributor

FE Regression Coverage Report

Increment line coverage 43.24% (32/74) 🎉
Increment coverage report
Complete coverage report

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants