⚡️ Speed up function find_last_node by 18,969%
#229
📄 18,969% (189.69x) speedup for find_last_node in src/algorithms/graph.py
⏱️ Runtime: 75.3 milliseconds → 395 microseconds (best of 250 runs)
📝 Explanation and details
The optimized code achieves a 190x speedup by eliminating a critical algorithmic inefficiency: replacing an O(N×M) nested iteration with O(N+M) operations using a set-based lookup.
Key Optimization
Original approach: For each node, iterate through ALL edges to check whether that node is a source
- The all(e["source"] != n["id"] for e in edges) check runs up to M comparisons for each of the N nodes
Optimized approach: Pre-build a set of all edge sources once, then check membership
- The edge_sources set is built in O(M) time
Why This Matters
The performance difference becomes dramatic as graph size increases.
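To make the scaling concrete, here is a minimal sketch that counts operations instead of timing them. The helper names (chain_graph, count_nested_checks, count_set_operations) and the chain-shaped test graph are assumptions made for illustration; they are not part of the PR.

```python
def chain_graph(size):
    """Hypothetical test graph: 0 -> 1 -> ... -> size-1, so the last node is size-1."""
    nodes = [{"id": i} for i in range(size)]
    edges = [{"source": i, "target": i + 1} for i in range(size - 1)]
    return nodes, edges


def count_nested_checks(nodes, edges):
    """Count edge comparisons made by the nested (per-node scan) approach."""
    checks = 0
    for n in nodes:
        is_source = False
        for e in edges:
            checks += 1
            if e["source"] == n["id"]:
                is_source = True
                break  # mirrors the short-circuit behavior of all()
        if not is_source:
            return n, checks
    return None, checks


def count_set_operations(nodes, edges):
    """Count set insertions plus membership tests made by the set-based approach."""
    ops = 0
    edge_sources = set()
    for e in edges:
        edge_sources.add(e["source"])
        ops += 1
    for n in nodes:
        ops += 1
        if n.get("id") not in edge_sources:
            return n, ops
    return None, ops


if __name__ == "__main__":
    for size in (10, 100, 1000):
        nodes, edges = chain_graph(size)
        _, nested = count_nested_checks(nodes, edges)
        _, with_set = count_set_operations(nodes, edges)
        print(f"size={size}: nested={nested} set-based={with_set}")
```

On this chain graph the nested count grows quadratically with the number of nodes while the set-based count grows linearly. For 1,000 nodes that is 500,499 comparisons against 1,999 set operations. Real timings depend on graph shape and on where the last node sits in the iteration order.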
Additional Change
The code uses n.get("id") instead of n["id"] to handle nodes missing the "id" key gracefully, maintaining the same behavior as the original code, which only accessed n["id"] during the comparison check. This prevents KeyError exceptions when processing malformed node data while preserving the performance benefit.
The optimization is beneficial across all test cases except trivially empty graphs, where overhead increases slightly (by 5-9%). It's especially impactful for graphs with many edges or when checking nodes late in the iteration order.
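For reference, the two approaches described above might look roughly like the sketch below. Only the all(...) expression, the edge_sources name, and the n.get("id") call come from the description. The function names, the next(..., None) wrapper, and the type hints are assumptions, and the actual code in src/algorithms/graph.py may differ.

```python
from typing import Any, Optional


def find_last_node_nested(nodes: list[dict[str, Any]],
                          edges: list[dict[str, Any]]) -> Optional[dict[str, Any]]:
    """Assumed shape of the original: scan every edge for every node, O(N*M)."""
    return next(
        (n for n in nodes if all(e["source"] != n["id"] for e in edges)),
        None,
    )


def find_last_node_set(nodes: list[dict[str, Any]],
                       edges: list[dict[str, Any]]) -> Optional[dict[str, Any]]:
    """Assumed shape of the optimized version: build the set once, O(N+M)."""
    edge_sources = {e["source"] for e in edges}  # one pass over the edges
    # n.get("id") avoids a KeyError for nodes that have no "id" key
    return next((n for n in nodes if n.get("id") not in edge_sources), None)


if __name__ == "__main__":
    nodes = [{"id": 1}, {"id": 2}, {"id": 3}]
    edges = [{"source": 1, "target": 2}, {"source": 2, "target": 3}]
    print(find_last_node_nested(nodes, edges))
    print(find_last_node_set(nodes, edges))
```

With an empty edge list, neither version reads the "id" key in a way that can fail, so both return the first node even when it has no "id".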
To edit these changes, git checkout codeflash/optimize-find_last_node-mjnd1wp9 and push.