[fix](test) Fix flaky adaptive scan profile test #65117
run buildall
/review
Automated review summary: I reviewed the single GitHub-changed file, regression-test/suites/query_profile/adaptive_pipeline_task_serial_read_on_limit.groovy, plus the relevant profile helper, last_query_id() implementation, session-variable forwarding, and BE scan concurrency counter paths. I did not find a substantiated issue that should block this PR.
Critical checkpoint conclusions:
- Goal and test coverage: the PR targets a flaky query-profile lookup by capturing the exact last_query_id() and polling the matching profile until Profile Completion State: COMPLETE. That matches the existing profile polling pattern used by the regression framework/plugin helpers and preserves the intended serial/non-serial scan cases.
- Scope and clarity: the change is narrowly scoped to the flaky regression test helper and call sites.
- Concurrency/lifecycle: no production concurrency path is changed. The asynchronous profile lifecycle is handled more directly than the previous fixed sleep.
- Configuration/session propagation: the existing test still covers the limit threshold, predicate case, and disabled enable_adaptive_pipeline_task_serial_read_on_limit case. The session variables remain normal query options forwarded to BE.
- Compatibility/storage/transactions: not applicable; this is test-only and does not change protocol, storage format, transactions, or persisted state.
- Parallel/special paths: I checked the BE MaxScanConcurrency counter semantics and the multi-BE topology concern. The remaining tolerant negative assertion predates this PR and is not a new regression from the changed lines.
- Test artifacts/style: this assertion-style query-profile regression test does not need a .out file. Added patch lines have no trailing whitespace.
Validation: I verified the GitHub PR file list and patch hygiene statically. I could not rerun the regression test in this checkout because thirdparty/installed is missing; the PR description reports timeout 300 ./run-regression-test.sh --run -f regression-test/suites/query_profile/adaptive_pipeline_task_serial_read_on_limit.groovy passed.
User focus points: none were provided beyond the normal full-PR review.
Subagent conclusions: optimizer-rewrite found no candidates. tests-session-config proposed TSC-1 about the negative profile assertion; I dismissed it with code evidence as a pre-existing/topology-tolerant assertion rather than a PR-introduced bug. Convergence round 1 ended with both live subagents replying NO_NEW_VALUABLE_FINDINGS for the same ledger/comment set, with no inline comments proposed.
Proposed changes
Stabilize query_profile/adaptive_pipeline_task_serial_read_on_limit by removing the fixed sleep before reading query profiles. The test now records the exact last_query_id() after each profiled query, waits for that profile row to reach Profile Completion State: COMPLETE, and then fetches the detailed profile before checking MaxScanConcurrency.
Root cause
The profile list and detailed BE profile are reported asynchronously after query results are returned. Under CI load, the old fixed wait could read /rest/v1/query_profile before the just-finished query appeared, causing intermittent Profile ID ... is not found failures.
Validation
Result:
Test 1 suites, failed 0 suites.
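For readers unfamiliar with the pattern, the poll-until-complete approach described under Proposed changes can be sketched roughly as follows. This is a minimal, framework-neutral illustration in Python, not the Groovy code from the PR: the function name wait_for_profile_complete, the fetch_state callback, and the timeout and interval defaults are all hypothetical. The only state string taken from the PR is COMPLETE; a missing profile row is modeled as None.

```python
import time
from typing import Callable, Optional


def wait_for_profile_complete(
    query_id: str,
    fetch_state: Callable[[str], Optional[str]],
    timeout_s: float = 60.0,
    interval_s: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll until the profile for query_id reports COMPLETE.

    fetch_state is a hypothetical callback that looks up the profile row
    for one exact query id and returns its completion state, or None when
    the row is not listed yet (the situation the fixed sleep could hit).
    Returns True once the state is COMPLETE, False if the deadline passes.
    """
    deadline = clock() + timeout_s
    while True:
        # Any state other than COMPLETE, including a missing row,
        # means the detailed profile is not safe to read yet.
        if fetch_state(query_id) == "COMPLETE":
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

A caller would capture the query id immediately after the profiled query, pass a lookup function for the profile list, and only fetch the detailed profile when the function returns True. Compared with a fixed sleep, the wait ends as soon as the profile is ready and fails with a clear timeout instead of a lookup error when it is not.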