⚡️ Speed up function _calculate_percentile by 10%
#166
📄 10% (0.10x) speedup for `_calculate_percentile` in `mlflow/tracing/utils/__init__.py`
⏱️ Runtime: 79.7 microseconds → 72.2 microseconds (best of 101 runs)
📝 Explanation and details
The optimization achieves a 10% speedup through three key changes that reduce computational overhead:

What was optimized:

1. Eliminated redundant `not sorted_data` check: Moved `n = len(sorted_data)` before the empty check and changed `if not sorted_data:` to `if n == 0:`. This avoids querying the list's length twice in the common non-empty case (once implicitly for the truthiness test and once for `len()`).
2. Reduced list access operations: Stored `sorted_data[lower]` and `sorted_data[upper]` in local variables (`lower_value`, `upper_value`) instead of accessing them multiple times during interpolation.
3. Simplified interpolation math: Changed from `sorted_data[lower] * (1 - weight) + sorted_data[upper] * weight` to the mathematically equivalent but computationally simpler `lower_value + weight * (upper_value - lower_value)`. This drops one multiplication, reducing the expression from four arithmetic operations to three.

Why this leads to speedup:

`len()` call in the typical execution path

Impact on workloads:

The function is called in a hot path within `add_size_stats_to_trace_metadata()` to compute P25, P50, and P75 percentiles for trace span sizes. Since this runs for every trace processed, the 10% improvement adds up in high-throughput tracing scenarios.

Test case performance:

The optimization performs best on typical interpolation cases (10-20% faster) but shows a slight regression on edge cases such as empty lists or percentiles that hit exact boundaries. However, the common case of percentile calculation with interpolation, which is the primary use case in production, sees consistent improvements.
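For readers without the diff at hand, the three changes can be sketched side by side. This is an illustrative reconstruction from the description above, not the actual MLflow source: the function names `percentile_before` and `percentile_after`, the `0.0` return for empty input, the index formula, and the convention that `percentile` is a fraction between 0 and 1 are all assumptions.

```python
import math


def percentile_before(sorted_data, percentile):
    """Hypothetical pre-optimization shape (assumed, not the real source)."""
    if not sorted_data:  # implicit length check
        return 0.0  # assumed result for empty input
    n = len(sorted_data)  # second length query
    index = percentile * (n - 1)  # assumed: percentile is a fraction in [0, 1]
    lower = int(index)
    upper = min(lower + 1, n - 1)
    weight = index - lower
    # Two multiplications, one subtraction, one addition.
    return sorted_data[lower] * (1 - weight) + sorted_data[upper] * weight


def percentile_after(sorted_data, percentile):
    """Hypothetical post-optimization shape applying the three changes."""
    n = len(sorted_data)  # single length query, reused below
    if n == 0:
        return 0.0
    index = percentile * (n - 1)
    lower = int(index)
    upper = min(lower + 1, n - 1)
    weight = index - lower
    lower_value = sorted_data[lower]  # cache list reads in locals
    upper_value = sorted_data[upper]
    # One multiplication, one subtraction, one addition.
    return lower_value + weight * (upper_value - lower_value)


# Quick equivalence check on made-up sample data.
sample = [10.0, 20.0, 30.0, 40.0]
for p in (0.0, 0.25, 0.5, 0.75, 1.0):
    assert math.isclose(percentile_before(sample, p), percentile_after(sample, p))
```

The two interpolation forms agree in exact arithmetic; in floating point they can differ in the last bits, which is why the check uses `math.isclose` rather than `==`.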
✅ Correctness verification report:
🌀 Generated Regression Tests and Runtime
To edit these changes, `git checkout codeflash/optimize-_calculate_percentile-mhwzg0wi` and push.