-
-
Notifications
You must be signed in to change notification settings - Fork 34.4k
assert: prevent OOM when generating diff for large objects #61347
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
assert: prevent OOM when generating diff for large objects #61347
Conversation
Add a regression test for an issue where assert.strictEqual causes OOM when comparing objects with many converging paths to shared objects. The test creates an object graph similar to Mongoose documents and verifies that the assertion fails with an AssertionError rather than crashing.
Objects with many converging paths to shared objects can cause exponential growth in util.inspect output. When assert.strictEqual fails on such objects, the error message generation would OOM while trying to create a diff of the 100+ MB inspect strings. Add a 2MB limit to inspectValue() output. When truncation occurs, a marker is added and the error message indicates lines were skipped. The comparison itself is unaffected; only the error output is truncated.
413a099 to
3b6e7b3
Compare
3b6e7b3 to
8905178
Compare
Co-authored-by: Aviv Keller <[email protected]>
|
@avivkeller |
|
I've investigated the CI failures: GitHub Actions failures (coverage-linux, test-linux, test-tarball-linux, x86_64-linux): All terminated with exit code 143 due to runner shutdown signals, not test failures. The tests that completed before shutdown all passed. Jenkins CI failures (node-test-commit-linuxone, node-test-commit-plinux, node-test-linux-linked-*): These show "tests failed" but I cannot access the logs to verify if they're related to this PR or infrastructure issues. Could a maintainer please re-trigger CI or check the Jenkins logs? @avivkeller |
| // Maximum size for inspect output before truncation to prevent OOM. | ||
| // Objects with many converging paths can produce exponential growth in | ||
| // util.inspect output at high depths, leading to OOM during diff generation. | ||
| const kMaxInspectOutputLength = 2 * 1024 * 1024; // 2MB |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion: split the string into chunks of 512kb and compare each chunk and combine the output. The output will not be perfect, as the combining process would sometimes be weird, while the smaller chunks should actually be much much faster to inspect in full.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea, will experiment with that and compare DX, and performance for any objects that are over 512kb. 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@BridgeAR
I experimented with the chunking approach. Here's what I found:
For large objects (>1MB inspect output), chunking does provide a speedup (1.3-1.5x) when there are many scattered differences. Memory usage per chunk is lower, which helps prevent OOM.
But the problem is that chunks get misaligned. We split actual and expected independently by byte size, but if the values differ in length, the boundaries shift:
ACTUAL CHUNKS EXPECTED CHUNKS
Chunk 0:
prop_01: 'aaa', prop_01: 'aaa',
prop_02: 'bbb', (empty)
Chunk 1:
prop_03: 'ccc', prop_02: 'MUCH_LONGER_VALUE',
prop_04: 'ddd', (empty)
Chunk 2:
prop_05: 'eee', prop_03: 'ccc',
prop_06: 'fff', prop_04: 'ddd',
When we diff Chunk 1 vs Chunk 1, we're comparing [prop_03, prop_04] against [prop_02]. The diff shows fake inserts/deletes at every boundary.
For the typical use case, both approaches are equally fast (1ms each on my machine) because Myers diff is efficient when there are few actual changes.
I'm leaning towards the 2MB truncation approach. It gives users a clear "[truncated]" message when the object is too large, and the diff output is straightforward without these artifacts.
What do you think?
Fixes #61346
Solution
Add a 2MB limit to
inspectValue()output inassertion_error.js. When truncation occurs:... [truncated]marker is added to the outputThe assertion logic is unaffected; only the error output is truncated.
Trade-offs
If both objects produce very large inspect output and happen to match exactly in the first 2MB but differ later, the diff would appear identical. However:
===failed)error.actualanderror.expectedprogrammaticallyAlternative approaches considered