test: improve some tests + retry others #10239
Merged
Related Issues
I analyzed the test failures on main since November 1st and found that a few integration tests are responsible for most of them. This PR tries to make those tests more reliable.
Proposed Changes:
- test_run_async_cancellation_integration: use a faster model to make sure that generation has already started when we cancel the task.
- test_live_run_with_agent_streaming_and_reasoning: define the function used here at the module level, not inside the test function. Otherwise it frequently fails with "Serialization of nested functions is not supported".
- test_live_run_with_toolset: relax the requirements for the "city" value: sometimes the model produces "Paris, France" instead of just "Paris".
- LinkContentFetcher: mark tests as flaky to retry them. In these cases, failures are often due to network problems.
How did you test it?
CI, multiple local runs
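For illustration, the relaxed "city" check described for test_live_run_with_toolset could look roughly like the sketch below. The helper name `city_matches` and the shape of the arguments dict are assumptions made for this example, not the code from the PR.

```python
def city_matches(tool_arguments: dict, expected_city: str = "Paris") -> bool:
    """Return True when the "city" argument mentions the expected city.

    Hypothetical helper: accepts both "Paris" and "Paris, France" by
    doing a case-insensitive substring match instead of strict equality.
    """
    city = str(tool_arguments.get("city", ""))
    return expected_city.lower() in city.lower()


# Both model outputs mentioned in the description pass the relaxed check.
assert city_matches({"city": "Paris"})
assert city_matches({"city": "Paris, France"})
# A different city, or a missing argument, is still rejected.
assert not city_matches({"city": "Berlin"})
assert not city_matches({})
```

A substring match is more tolerant than equality, so it trades a little strictness for fewer spurious failures when the model adds a country or region to the city name.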
Checklist
fix:, feat:, build:, chore:, ci:, docs:, style:, refactor:, perf:, test: and added ! in case the PR includes breaking changes.